I. INTRODUCTION


A. LITERATURE BACKGROUND 

This thesis is based on the paper "An 8 x 8 Discrete Cosine Transform Chip with
Pixel Rate Clock" by L. J. D'Luna [Ref. 1]. That paper introduced an algorithm and
implementation for the one-dimensional (1-D) and two-dimensional (2-D) Discrete
Cosine Transform (DCT) based on the principle of distributed arithmetic, and
presented a hardware circuit architecture that implements the algorithm.

Another very important aspect discussed in this thesis is the implementation of a
"Top-Down Design" concept that uses the Very High Speed Integrated Circuit (VHSIC)
Hardware Description Language (VHDL) [Ref. 4-8] as a tool. "Top-Down Design" is a
design approach that first describes the given algorithm in a high-level language. After
the algorithm is described, the structural architecture is described next. Finally, this
structural description is developed into a hardware circuit. VHDL facilitates the
algorithm description, the structural description, and the hardware circuit simulation.


B. OBJECTIVE 
The purpose of this thesis is to describe the behavior of the implemented
architecture of the algorithm mentioned above in the VHSIC Hardware Description
Language (VHDL). The description was simulated on a workstation in order to analyze
its characteristics. In the process of describing the behavior of this structural
architecture, complicated hardware circuits are expressed as behavioral models. This is
usually the first step in a "Top-Down Design" task. The objective is to use a DCT
implementation as an example to study the "Top-Down Design" methodology.


C. RATIONALE FOR USING VHDL TO DESCRIBE THE CIRCUIT 

In the past, VHSIC design was dominated by bottom-up design methodologies
where hardware circuit details were established and produced before the system was
constructed [Ref. 4]. This methodology is very useful in dealing with small circuits.
However, as the system becomes more complicated, the bottom-up design methodology
becomes more difficult to handle. In this work, a high-level, top-down design approach
is taken. Initially, a description of the algorithm is written. Later on, a detailed
architecture is described. Both are done in VHDL, a hierarchical hardware description
language which supports mixed-level simulation. This thesis shows the beginning steps
of a "Top-Down Design" approach. The 8 X 8 image block DCT algorithm was implemented
as a behavioral model and a structural model. VHDL was used here to accomplish the
initial design of the 1-D Discrete Cosine Transform implementation.


D. OVERVIEW OF THE THESIS 

There are six chapters in this thesis. The first chapter is an introduction to the
literature background, the objective, and the reasons for using VHDL. Chapter II
introduces the Discrete Cosine Transform algorithm and the principle of distributed
arithmetic. Chapter III examines the components of the structural architecture. Chapter
IV gives the actual VHDL behavioral description of the components, their circuit
block diagrams, and their connections. Chapter V analyzes the simulation results and
reports some experience with design problems. The last chapter is the conclusion.


II. BASIC DISCRETE COSINE TRANSFORM THEORY 


A. DISCRETE COSINE TRANSFORM IN IMAGE COMPRESSION 


1. Rationale for using Discrete Cosine Transform 

Image transmission or storage usually deals with a large amount of digital
data. There are usually 512 X 512 pixels in a monochrome picture. If one pixel needs
8 bits to represent its information, transmitting a monochrome picture means that more
than two megabits (512 X 512 X 8 = 2,097,152) of digital data need to be transmitted.
There are many coding methods that compress large amounts of data to reduce the
transmission bandwidth and the amount of storage space required. Among these methods,
transform domain compression is an effective way to eliminate the redundant information
in images, since image data are usually highly correlated.

Image transformation is used to extract a small number of significant 
coefficient values from the original image, by mapping the image data onto a two- 
dimensional spectrum. Each coefficient in the transform domain represents some amount 
of energy of the spectral component. The original spatial image can then be recovered 
back from these coefficients, since each image has its own specific spectral pattern. After 
the transformation, there are only a few coded values required to describe the original 
image. Consequently, it is possible to save bits during transmission and storage. 

The Fourier transform algorithm has been applied to image processing for a
long time, since it possesses many desirable analytic properties. However, it has two
major drawbacks. First, the computation of the Fourier transform involves complex
numbers rather than real numbers. Secondly, the spectrum energy decreases slowly as
frequency increases. This slow decay of the spectrum is a very significant
disadvantage in image coding.

The Discrete Cosine Transform (DCT) has the advantage of involving only
real number computations. It is well suited for image data compression. Consequently,
two-dimensional cosine transforms of 8 X 8 image blocks have been adopted in a draft
international standard (JPEG) [Ref. 1]. This thesis concentrates on studying the
Discrete Cosine Transform and building a circuit for 8 X 8 image blocks.


2. Formulae of the Discrete Cosine Transform 
The general formula of a one-dimensional Discrete Cosine Transform (1-D
DCT) is expressed as

    Z_k = \sum_{i=0}^{N-1} X_i C_{ik}                                        (1)

where Z_k is the transform of X_i, C_{ik} is the forward transformation kernel, and i and k
range from 0 to N - 1. The inverse transform of the 1-D DCT is given by the relation

    X_i = \sum_{k=0}^{N-1} Z_k H_{ki}                                        (2)

where H_{ki} is the inverse transformation kernel. The characteristic of the transform is
determined by its transformation kernel properties.

The 1-D DCT forward kernel is given by

    C_{i0} = \frac{1}{\sqrt{N}}                                              (3)

    C_{ik} = \sqrt{\frac{2}{N}} \cos\frac{(2i+1)k\pi}{2N},   k = 1, 2, ..., N-1.   (4)


Substituting Eq. (3) and (4) into Eq. (1) yields

    Z_0 = \frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} X_i                            (5)

    Z_k = \sqrt{\frac{2}{N}} \sum_{i=0}^{N-1} X_i \cos\frac{(2i+1)k\pi}{2N},   k = 1, 2, ..., N-1   (6)

where Z_k, k = 0, 1, 2, ..., N-1, is the 1-D DCT of X_i.
The inverse kernel is of the same form as Eq. (3) and (4), so that the inverse
DCT is expressed by the equation

    X_i = \frac{1}{\sqrt{N}} Z_0 + \sqrt{\frac{2}{N}} \sum_{k=1}^{N-1} Z_k \cos\frac{(2i+1)k\pi}{2N}   (7)

where i = 0, 1, 2, ..., N-1.
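The forward and inverse relations above can be checked numerically. The following is a minimal Python sketch, not part of the original design; the function names `dct_kernel`, `dct_1d`, and `idct_1d` and the sample data are assumptions introduced only for illustration. It builds the kernel of Eqs. (3) and (4), applies Eq. (1), and recovers the input with the inverse kernel of the same form.

```python
import math

def dct_kernel(N):
    """Kernel C[i][k] following Eqs. (3) and (4) (illustrative helper)."""
    C = [[0.0] * N for _ in range(N)]
    for i in range(N):
        C[i][0] = 1.0 / math.sqrt(N)
        for k in range(1, N):
            C[i][k] = math.sqrt(2.0 / N) * math.cos((2 * i + 1) * k * math.pi / (2 * N))
    return C

def dct_1d(x):
    """Forward 1-D DCT: Z_k = sum_i X_i C_ik, as in Eq. (1)."""
    N = len(x)
    C = dct_kernel(N)
    return [sum(x[i] * C[i][k] for i in range(N)) for k in range(N)]

def idct_1d(z):
    """Inverse 1-D DCT using a kernel of the same form, as in Eq. (7)."""
    N = len(z)
    C = dct_kernel(N)
    return [sum(z[k] * C[i][k] for k in range(N)) for i in range(N)]

x = [52, 55, 61, 66, 70, 61, 64, 73]   # arbitrary sample data
z = dct_1d(x)
x_back = idct_1d(z)
```

The round trip works because the kernel is orthonormal, which is why the inverse kernel can share the form of the forward kernel.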


The two-dimensional forward DCT kernel is given as

    C_{ij,00} = \frac{1}{N}                                                  (8)

    C_{ij,kl} = \frac{2}{N} [\cos\frac{(2i+1)k\pi}{2N}] [\cos\frac{(2j+1)l\pi}{2N}]   (9)

where i, j = 0, 1, ..., N-1, and k, l = 1, 2, ..., N-1. The inverse kernel is also of
this form. Thus, the two-dimensional DCT pair is expressed by
where i, j
It can be seen from Eqs. (3), (4), (8), and (9) that the DCT transformation kernels
are separable. Therefore, the two-dimensional forward or inverse transformation can
be computed by applying two one-dimensional DCT operations successively.


B. ALGORITHM FOR 8 BY 8 IMAGE DISCRETE COSINE TRANSFORM 
1. Methodology of 2-D DCT 
Let x_{ij} denote an image pixel value, which is an n-bit number. The indices
i and j represent the row and column location of the pixel, respectively. The N X N
two-dimensional DCT can be expressed by

    Z_{00} = \frac{1}{N} \sum_{j=0}^{N-1} \sum_{i=0}^{N-1} x_{ij}           (13)

    Z_{kl} = \frac{2}{N} \sum_{j=0}^{N-1} \sum_{i=0}^{N-1} x_{ij} \cos\frac{(2i+1)k\pi}{2N} \cos\frac{(2j+1)l\pi}{2N}   (14)


Z_{kl} is the spectral coefficient corresponding to the k-th horizontal frequency and
l-th vertical frequency. In matrix notation, the inner summation is equivalent to a 1-D
DCT computation on the columns of X. The outer summation is equivalent to a 1-D DCT
computation on the rows of the inner summation results. C can be used to represent the
DCT matrix. Its columns are the 1-D DCT basis vectors, with elements c_{mk} (1-D DCT
kernels), where

    c_{m0} = \frac{1}{\sqrt{N}},   m = 0, 1, 2, ..., N-1                     (15)

    c_{mk} = \sqrt{\frac{2}{N}} \cos\frac{(2m+1)k\pi}{2N}                    (16)

m = 0, 1, 2, ..., N-1; k = 1, 2, ..., N-1. Because the kernels of the DCT
transformation can be separated, the matrix Z of 2-D DCT coefficients can be
represented as

    Z = [X^T C]^T C = C^T X C.                                               (17)


This process can be realized in the architecture shown in Fig. 1 (refer to [Ref. 1]).
Fig. 1 2-D DCT Block Diagram
The N X N block of image X is input column by column first, and the 1-D DCT
computation is done. This computation is carried out as shown in the square brackets of
Eq. (17) for the j-th column (for j = 0, 1, ..., N-1). The resulting N X N matrix
is then transposed for the second, row-by-row 1-D DCT computation. This transpose is
described by the term outside the square brackets in Eq. (17). After the
transposition, the same 1-D DCT computation involving the same transform matrix C is
carried out again. The transpose step takes care of the column-to-row exchange
of the data. The key operations involved here are the matrix transpose and the 1-D DCT
computation.
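The row-column procedure of Eq. (17) can be illustrated with a short Python sketch. This is only a numerical model under stated assumptions: the helper names (`dct_matrix`, `matmul`, `transpose`, `dct_2d_row_column`, `dct_2d_direct`) and the test block are hypothetical, and floating-point arithmetic stands in for the fixed-point hardware. It compares the two-pass computation with a direct double summation over the separable kernel.

```python
import math

def dct_matrix(N):
    """Matrix C with elements c[m][k] following Eqs. (15) and (16)."""
    C = [[1.0 / math.sqrt(N)] * N for _ in range(N)]
    for m in range(N):
        for k in range(1, N):
            C[m][k] = math.sqrt(2.0 / N) * math.cos((2 * m + 1) * k * math.pi / (2 * N))
    return C

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def dct_2d_row_column(X):
    """Z = [X^T C]^T C: a 1-D DCT pass, a transpose, and the same pass again."""
    C = dct_matrix(len(X))
    first_pass = matmul(transpose(X), C)        # 1-D DCT of every column of X
    return matmul(transpose(first_pass), C)     # transpose, then second 1-D DCT

def dct_2d_direct(X):
    """Direct double sum with the separable kernel c[i][k] * c[j][l]."""
    N = len(X)
    C = dct_matrix(N)
    return [[sum(X[i][j] * C[i][k] * C[j][l] for i in range(N) for j in range(N))
             for l in range(N)] for k in range(N)]

X = [[(3 * i + 5 * j) % 17 for j in range(8)] for i in range(8)]  # arbitrary block
Z_rc = dct_2d_row_column(X)
Z_direct = dct_2d_direct(X)
```

The two results agree to rounding error, which is the property that lets a single 1-D DCT unit and a transpose memory serve for the whole 2-D transform.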


2. Principle of distributed arithmetic 
The implementation of the 1-D DCT studied here is based on the principle
of distributed arithmetic. Using this principle, it is possible to implement the "bit
calculation" in the chip design. "Bit multiplication" is simply carried out by using the
input data bit pattern to address a Read Only Memory and by summing up all the results
to obtain the "transposed spectral values". If Y_i (Y_i = [y_{im}], m = 0, 1, ..., N-1) is
the image pixel value represented by a row vector, then its 1-D DCT is

    Z_{ik} = \sum_{m=0}^{N-1} y_{im} c_{mk},   k = 0, 1, ..., N-1.           (18)
Now the input data y_{im} can be represented in p-bit 2's complement notation as

    y_{im} = -y_{im}^{(p-1)} 2^{p-1} + \sum_{q=0}^{p-2} y_{im}^{(q)} 2^q     (19)

where y_{im}^{(q)} is the q-th bit of the incoming image pixel value y_{im}, which has a
value of either 0 or 1, and 2^q is the binary weight of the q-th bit. For example, if the
input data is a 2's complement 8-bit pattern, then

    y_{im} = -y_{im}^{(7)} 2^7 + y_{im}^{(6)} 2^6 + y_{im}^{(5)} 2^5 + y_{im}^{(4)} 2^4
             + y_{im}^{(3)} 2^3 + y_{im}^{(2)} 2^2 + y_{im}^{(1)} 2^1 + y_{im}^{(0)} 2^0.

Substituting Eq. (19) into Eq. (18) yields

    Z_{ik} = -\sum_{m=0}^{N-1} c_{mk} y_{im}^{(p-1)} 2^{p-1} + \sum_{q=0}^{p-2} \sum_{m=0}^{N-1} c_{mk} y_{im}^{(q)} 2^q   (20)

    Z_{ik} = -F_k(C_k, Y_i^{(p-1)}) 2^{p-1} + \sum_{q=0}^{p-2} F_k(C_k, Y_i^{(q)}) 2^q   (21)

where F_k is a function of the vectors C_k and Y_i^{(q)} and is represented as

    F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{N-1} c_{mk} y_{im}^{(q)}   for q = 0, 1, 2, ..., p-1.   (22)

Its expanded form can be shown as

    F_k(C_k, Y_i^{(q)}) = c_{0k} y_{i0}^{(q)} + c_{1k} y_{i1}^{(q)} + ... + c_{N-1,k} y_{i,N-1}^{(q)}   (23)

where q = 0, 1, ..., p-1.
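The bit-level decomposition of Eqs. (19) through (23) can be sketched in Python as follows. This is a minimal numerical model, assuming N = 8 points and p = 8 bits; the names `c`, `bit`, `F`, `z_distributed`, and `z_direct` and the sample vector are illustrative only. Python's right shift on negative integers behaves as an arithmetic shift, so `(value >> q) & 1` returns the 2's complement bit.

```python
import math

N, P = 8, 8   # number of data points and bits per datum (assumed values)

def c(m, k):
    """DCT coefficient c_mk following Eqs. (15) and (16)."""
    if k == 0:
        return 1.0 / math.sqrt(N)
    return math.sqrt(2.0 / N) * math.cos((2 * m + 1) * k * math.pi / (2 * N))

def bit(value, q):
    """q-th bit of value in 2's complement (valid while value fits in P bits)."""
    return (value >> q) & 1

def F(k, y, q):
    """Eq. (22): sum of the coefficients selected by the q-th bit slice."""
    return sum(c(m, k) * bit(y[m], q) for m in range(N))

def z_distributed(k, y):
    """Eq. (21): weighted sum of F over bit positions, sign bit subtracted."""
    acc = sum(F(k, y, q) * 2 ** q for q in range(P - 1))
    return acc - F(k, y, P - 1) * 2 ** (P - 1)

def z_direct(k, y):
    """Eq. (18): ordinary multiply-accumulate, for comparison."""
    return sum(c(m, k) * y[m] for m in range(N))

y = [-128, -3, 0, 7, 45, 100, 127, -64]   # arbitrary 8-bit signed samples
errors = [abs(z_distributed(k, y) - z_direct(k, y)) for k in range(N)]
```

No multiplication by a data value occurs in `z_distributed`; the data bits only select which coefficients are summed, which is what makes a ROM lookup possible.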


3. Methodology for forming the ROM storage 
In Eq. (23), c_{mk} are the 1-D DCT basis (kernel) vectors used as multiplication
coefficients. They are converted from decimal numbers to the 2's complement notation
used in this thesis. y_{im}^{(q)} are the bit patterns, in 2's complement form, of the
N data points y_{im}. Because the basis vectors are fixed-value coefficients and F_k are
functions of the basis vectors and the binary bit patterns, the values of F_k (with a fixed
k) for all possible N-bit patterns (y_{im}^{(q)}, m = 0, 1, 2, ..., N-1) can be calculated
and stored in Read Only Memory (ROM) according to Eq. (22) and Eq. (23). The N-bit
pattern changes with time according to the incoming data y_{im}^{(q)} (m = 0, 1, 2, ..., N-1).
This bit pattern forms an address to access the ROM and extract the corresponding
F_k(C_k, Y_i^{(q)}) value.

From Eq. (20) and Eq. (21), the corresponding 1-D DCT spectral coefficient
Z_{ik} can be computed by shifting and adding the F_k values stored in the ROM. In Eq.
(21), F_k is a function of the corresponding basis column vector C_k for k = 0, 1, 2, ...,
N-1, so F_k differs as k varies. The incoming data vector Y_i is the same
for all values of k. It is therefore possible to build
N separate memory banks of multiplication coefficients and compute the N 1-D DCT
spectral coefficients Z_{ik} (k = 0, 1, 2, ..., N-1) in parallel.
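The ROM formation described above can be modeled in a few lines of Python. This is a sketch under assumptions, not the stored contents of the actual chip: N = 8 and p = 8 are assumed, address bit m is taken from data point m (the real wiring order is not specified here), and the ROM words are kept as floating-point numbers rather than 16-bit 2's complement words. The names `build_rom`, `address`, and `z_from_rom` are hypothetical.

```python
import math

N, P = 8, 8   # data points per column and bits per datum (assumed values)

def c(m, k):
    """DCT coefficient c_mk following Eqs. (15) and (16)."""
    if k == 0:
        return 1.0 / math.sqrt(N)
    return math.sqrt(2.0 / N) * math.cos((2 * m + 1) * k * math.pi / (2 * N))

def build_rom(k):
    """Precompute F_k for every N-bit pattern; address bit m selects c_mk."""
    return [sum(c(m, k) for m in range(N) if (addr >> m) & 1)
            for addr in range(1 << N)]

def address(y, q):
    """Form the ROM address from the q-th bit of each of the N data points."""
    return sum(((y[m] >> q) & 1) << m for m in range(N))

def z_from_rom(rom, y):
    """Shift-and-add the ROM words per Eq. (21); the sign-bit word is subtracted."""
    acc = 0.0
    for q in range(P):
        word = rom[address(y, q)]
        acc += -word * 2 ** q if q == P - 1 else word * 2 ** q
    return acc

y = [-128, -3, 0, 7, 45, 100, 127, -64]          # arbitrary 8-bit signed samples
roms = [build_rom(k) for k in range(N)]          # one memory bank per coefficient
z = [z_from_rom(roms[k], y) for k in range(N)]   # banks are independent of each other
z_ref = [sum(c(m, k) * y[m] for m in range(N)) for k in range(N)]
```

Because each bank depends only on its own k, the eight lookups for one bit slice can proceed concurrently, as the text notes.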


4. Exploiting the symmetry in DCT to save storage in ROM 
Here, 8 X 8 image blocks are used, so N = 8. Each bit slice of the eight incoming
data points therefore forms an 8-bit address, which means 2^8 = 256 possible bit
patterns. There would be 256 corresponding multiplication coefficient sums stored in
the ROM for each of the 8 DCT spectral coefficients. However, advantage can be taken of
the symmetry in the DCT basis vectors. It can be shown that


    c_{mk} = c_{N-1-m,k}   for k = 0, 2, ..., N-2 (k even).                  (24)

For example,

                                                                             (25)

where c_{mk} is defined by Eq. (15) and Eq. (16). And the following can be proven,

    c_{mk} = -c_{N-1-m,k}   for k = 1, 3, ..., N-1 (k odd).                  (26)

For example,

    c_{11} = \sqrt{\frac{2}{8}} \cos\frac{3\pi}{16} = -\sqrt{\frac{2}{8}} \cos\frac{13\pi}{16} = -c_{61}.   (27)

Hence, Eq. (18) can be reduced to

    Z_{ik} = \sum_{m=0}^{N/2-1} (y_{im} + y_{i,N-1-m}) c_{mk}
    where k = 0, 2, ..., N-2 (k even)                                        (28)

and,

    Z_{ik} = \sum_{m=0}^{N/2-1} (y_{im} - y_{i,N-1-m}) c_{mk}
    where k = 1, 3, ..., N-1 (k odd).                                        (29)

Equations (22) and (23) then can be reduced to

    F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{N/2-1} c_{mk} (y_{im} + y_{i,N-1-m})^{(q)}
    where k = 0, 2, 4, ..., N-2                                              (30)

    F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{N/2-1} c_{mk} (y_{im} - y_{i,N-1-m})^{(q)}
    where k = 1, 3, 5, ..., N-1.                                             (31)


From the above equations, it is possible to add or subtract the incoming data
points before memory access and reduce the number of address bits for each ROM from
N to N/2. The total number of bit patterns is now only 2^{N/2} = 2^4 = 16. Only a
16-word ROM is necessary for each of the 8 DCT coefficients, and therefore a total of
16 x 8 = 128 ROM words is required. This savings of ROM storage is significant compared
to the cost of using adders and subtractors. Since only one particular bit pattern
(those bits which have the same binary weight) is allowed to address the ROM at a time,
and the bit pattern changes according to the serially incoming data, the addition and
subtraction can be done in a bit-serial fashion. This advantage is
exploited in the chip implementation discussed in the next chapter.
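The storage reduction can be checked with the following Python sketch. It is an illustrative model only: the names `build_rom16` and `z_folded` are hypothetical, the sums and differences are formed as whole integers rather than bit-serially, and one extra bit (p + 1 = 9 bits) is assumed for the sum or difference of two 8-bit values.

```python
import math

N, P = 8, 8          # block size and bits per datum (assumed values)
HALF = N // 2

def c(m, k):
    """DCT coefficient c_mk following Eqs. (15) and (16)."""
    if k == 0:
        return 1.0 / math.sqrt(N)
    return math.sqrt(2.0 / N) * math.cos((2 * m + 1) * k * math.pi / (2 * N))

def build_rom16(k):
    """16-word ROM: address bit m selects c_mk for m = 0..3 only."""
    return [sum(c(m, k) for m in range(HALF) if (addr >> m) & 1)
            for addr in range(1 << HALF)]

def z_folded(k, y):
    """Eqs. (28)-(31): add (k even) or subtract (k odd) before the ROM lookup."""
    sign = 1 if k % 2 == 0 else -1
    w = [y[m] + sign * y[N - 1 - m] for m in range(HALF)]
    rom = build_rom16(k)
    bits = P + 1                      # sum/difference needs one extra bit
    acc = 0.0
    for q in range(bits):
        addr = sum(((w[m] >> q) & 1) << m for m in range(HALF))
        acc += -rom[addr] * 2 ** q if q == bits - 1 else rom[addr] * 2 ** q
    return acc

y = [-128, 127, 5, -7, 64, -1, 0, 100]          # arbitrary 8-bit signed samples
z = [z_folded(k, y) for k in range(N)]
z_ref = [sum(c(m, k) * y[m] for m in range(N)) for k in range(N)]
total_words = sum(len(build_rom16(k)) for k in range(N))
```

The folded result matches the full 8-term sum while each bank shrinks from 256 words to 16.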
III. A STRUCTURAL ARCHITECTURE FOR THE 1-D DCT


A. 8 x 8 IMAGE BLOCK 1-D DCT CIRCUIT ARCHITECTURE 


The 1-D DCT architecture studied previously is shown in Fig. 2 [Ref. 1]. There
are 8 slices parallel to each other corresponding to the 8 DCT coefficients which are
computed concurrently. First, 12-bit pixels AI(11:0) are put column by column into the
"serial-in-parallel-out" shift register (A). This sequence needs 8 clock cycles to complete.
After the 8th clock, the shift registers output the data into the "parallel load 2-bit serial
shift register" (B) at once. This is completed at the 9th clock cycle. At the same time, the
serial-in-parallel-out shift registers also get their new incoming data. The data stored in
the B shift register has to be added or subtracted according to Eqs. (30) and (31) in order
to reduce the ROM storage. In order to make Eqs. (30) and (31) more understandable,
they are expanded as below

Fig. 2 Architecture of 1-D DCT


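    F_k(C_k, Y_i^{(q)}) = \sum_{m=0}^{3} c_{mk} (y_{im} \pm y_{i,7-m})^{(q)}

                m = 0                        m = 1                        m = 2                        m = 3
    = c_{00}(y_{i0}+y_{i7})^{(q)} + c_{10}(y_{i1}+y_{i6})^{(q)} + c_{20}(y_{i2}+y_{i5})^{(q)} + c_{30}(y_{i3}+y_{i4})^{(q)}   ---- k = 0
    = c_{02}(y_{i0}+y_{i7})^{(q)} + c_{12}(y_{i1}+y_{i6})^{(q)} + c_{22}(y_{i2}+y_{i5})^{(q)} + c_{32}(y_{i3}+y_{i4})^{(q)}   ---- k = 2
    = c_{04}(y_{i0}+y_{i7})^{(q)} + c_{14}(y_{i1}+y_{i6})^{(q)} + c_{24}(y_{i2}+y_{i5})^{(q)} + c_{34}(y_{i3}+y_{i4})^{(q)}   ---- k = 4
    = c_{06}(y_{i0}+y_{i7})^{(q)} + c_{16}(y_{i1}+y_{i6})^{(q)} + c_{26}(y_{i2}+y_{i5})^{(q)} + c_{36}(y_{i3}+y_{i4})^{(q)}   ---- k = 6
    = c_{01}(y_{i0}-y_{i7})^{(q)} + c_{11}(y_{i1}-y_{i6})^{(q)} + c_{21}(y_{i2}-y_{i5})^{(q)} + c_{31}(y_{i3}-y_{i4})^{(q)}   ---- k = 1
    = c_{03}(y_{i0}-y_{i7})^{(q)} + c_{13}(y_{i1}-y_{i6})^{(q)} + c_{23}(y_{i2}-y_{i5})^{(q)} + c_{33}(y_{i3}-y_{i4})^{(q)}   ---- k = 3
    = c_{05}(y_{i0}-y_{i7})^{(q)} + c_{15}(y_{i1}-y_{i6})^{(q)} + c_{25}(y_{i2}-y_{i5})^{(q)} + c_{35}(y_{i3}-y_{i4})^{(q)}   ---- k = 5
    = c_{07}(y_{i0}-y_{i7})^{(q)} + c_{17}(y_{i1}-y_{i6})^{(q)} + c_{27}(y_{i2}-y_{i5})^{(q)} + c_{37}(y_{i3}-y_{i4})^{(q)}   ---- k = 7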


The numbers above the expanded equation represent the index m, and the numbers on
the right side are the index k. c_{mk} are multiplication coefficients. Whether the bits
are added or subtracted is determined by whether k is an even or odd number.

Registers B must be emptied in less than 8 clock cycles in order to receive new
data coming from registers A. Each datum is 12 bits in length. If a single bit is shifted
out of registers B per clock, it will take 12 clock cycles to empty the register. This
will cause a collision during the addition and subtraction of the data. There are two
ways to solve this problem: either clock register B twice as fast or shift out data 2
bits at a time. The latter alternative has been chosen because it simplifies the design
and the system timing. The shifted 2-bit data is added or subtracted in the "2-bit
adder/subtractor" C. Their output is stored in the shift registers D, which split the least
significant bit and most significant bit (binary weight q = 0 and q = 1) into two output
lines.

Next comes the question of where the output data of the adders and subtractors
should go to address the ROM, and how the values in the ROM should be arranged. The
expanded equations above show that all the adder outputs, designated
U_0(0:3) and U_1(0:3) (refer to Fig. 2), are 4-bit patterns formed from the sums
of the paired data bits y_{im}^{(q)}; q = 0 represents the LSB and q = 1 represents the MSB
in Eqs. (20) and (21). U_0(0:3) and U_1(0:3) should be multiplied by the coefficients c_{mk},
where k = 0, 2, 4, 6. All the difference outputs V_0(3:0) and V_1(3:0) should
be multiplied by the coefficients c_{mk}, where k = 1, 3, 5, 7. As a result, the bit
patterns output by the four adders and subtractors form a 4-bit address to access the
corresponding accumulated sum of the coefficients c_{mk}, k = 0, 1, ..., 7, which are
stored in ROM E. This step accomplishes the 1-D DCT coefficient multiplication. The
output of the ROM is first latched in register F, and then adder/subtractor G calculates
the sum of the "2-bit" spectral coefficient values according to Eq. (21). The LSB (q = 0)
values are shifted to the right one position and added to the q = 1 values. This addition
continues until the last bit pattern (12th) of the incoming column data. According to Eq.
(19), the incoming data are represented in 2's complement notation, so the most
significant bit's value should be subtracted from all the previous summations. This is
done by changing the add/sub control line of G to subtraction at the clock cycle of the
last bit pattern for each column of data.

The 2-bit sum or difference results of G are stored in register H and then sent to
the accumulator I and J. The accumulator consists of one "16-bit adder" and a "shift
right 2-bit register". The value stored in ROM E is a 16-bit word. The 16-bit adder I
adds the previous 2-bit right-shifted value (output of J) to the incoming value (output of
H). The resulting value is then output to register J to do the 2-bit right shift. This
process accomplishes the computation of Eq. (21) as index q varies from 0 to p-1 in 2-bit
increments. One thing has to be noted with caution: the initial value in the shift-right
2-bit registers for every incoming column of data should be zero. Otherwise, the previous
column values would accumulate. To avoid this, the shift-right 2-bit register is cleared
at the beginning of the accumulation of every column group.
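The two-bits-per-clock accumulation can be modeled as follows. This Python sketch is only an arithmetic illustration under stated assumptions: p = 12 bits per datum, the unfolded F_k of Eq. (22) is used instead of the ROM with pre-added inputs, and floating-point division stands in for the hardware right shifts, so no truncation occurs. The function names and the final rescaling by 2^{p-1} are introduced here for checking and are not taken from the original design.

```python
import math

N, P = 8, 12   # data points per column and bits per datum (assumed values)

def c(m, k):
    """DCT coefficient c_mk following Eqs. (15) and (16)."""
    if k == 0:
        return 1.0 / math.sqrt(N)
    return math.sqrt(2.0 / N) * math.cos((2 * m + 1) * k * math.pi / (2 * N))

def F(k, y, q):
    """Eq. (22): coefficient sum selected by the q-th bit slice of the data."""
    return sum(c(m, k) * ((y[m] >> q) & 1) for m in range(N))

def z_two_bits_per_clock(k, y):
    acc = 0.0                                  # register J cleared for each column
    for step in range(P // 2):
        lsb = F(k, y, 2 * step)                # q = 0 line of this 2-bit slice
        msb = F(k, y, 2 * step + 1)            # q = 1 line of this 2-bit slice
        if 2 * step + 1 == P - 1:
            g = lsb / 2 - msb                  # sign bit: G switched to subtraction
        else:
            g = lsb / 2 + msb                  # LSB value shifted right one, then added
        acc = acc / 4 + g                      # J shifts right 2 bits, adder I adds new value
    return acc * 2 ** (P - 1)                  # undo the scaling introduced by the shifts

y = [-2048, 2047, -1, 0, 1000, -1000, 5, 77]   # arbitrary 12-bit signed samples
z = [z_two_bits_per_clock(k, y) for k in range(N)]
z_ref = [sum(c(m, k) * y[m] for m in range(N)) for k in range(N)]
```

Starting each column from `acc = 0.0` corresponds to clearing the shift-right register; without it the previous column would leak into the result.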

After 8 clock cycles, the accumulated values are parallel loaded into register K.
Similar to register A but in the reverse direction, register K puts out the 1-D DCT
spectral coefficients column by column. These 1-D DCT coefficients are then transposed
by the transpose RAM (TRAM) according to Eq. (17). The transpose RAM is described
in the next section. After the transpose RAM, the 1-D DCT coefficients are input
again into the same 1-D DCT architecture. The only difference now is that the registers A
and B have to be expanded from 12 bits to 16 bits for the second transform.


B. TRANSPOSE RAM ARCHITECTURE 

According to Eq. (17), the purpose of the "transpose RAM" is to change the columns of
the 8 x 8 1-D DCT coefficient block into rows, and its rows into columns. The
coefficient values are generated from the 1-D DCT architecture column by column.
These values are written into a RAM while the transposed values are read out.
Therefore, the transpose RAM must be able to accept the incoming 1-D DCT
values and deliver the transposed values in the same cycle. How can this be done?

The coefficient values come out of the 1-D DCT architecture in serial order; the
0, 1, 2, ..., 7 coefficients of the first column of the 8 X 8 block come in first and then
the 0, 1, ..., 7 coefficients of the second column and the third column and so on. This
order is a long stream of coefficients 0, 1, ..., 63 for each 8 X 8 image block. After
storing them in the RAM, the coefficients must be read out in groups of 8 values in the
order of 0, 8, 16, ..., 56; 1, 9, 17, ..., 57; 2, 10, 18, ..., 58; 3, 11, 19, ..., 59; 4, 12,
20, ..., 60; 5, 13, 21, ..., 61; 6, 14, 22, ..., 62; 7, 15, 23, ..., 63 to achieve the transpose
operation. In the same cycle, just after reading out a value of the first block,
the next coefficient value of the second block can be written into that location. It is
like reading block 1_0 (first 8 x 8 block position 0) and writing block 2_0 (second 8 x
8 block position 0), reading block 1_8 and writing block 2_1, reading block 1_16 and
writing block 2_2, and so on. In order to achieve the transpose of the second block, the
sequence for reading out block 2 must be in the order of 0, 1, 2, ..., 63. When reading
out the coefficients of block 2, the third block coefficients are being written into the same
locations just after read out. The order is like reading block 2_0 and writing block
3_0, reading block 2_1 and writing block 3_1, reading block 2_2 and writing block 3_2,
and so on. Notice the address order is sequential (0, 1, 2, ..., 63) first, then 0, 8,
16, ..., then sequential again, and so on.

As shown above, the structural architecture design is based on the principle of
distributed arithmetic, and it is data-path oriented. The methodology used to describe
this architecture in VHDL and to simulate it on a computer is discussed in the next chapter.
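The alternating address sequence of the transpose RAM can be modeled with a short Python sketch. It is a behavioral illustration only; the names `address_sequence` and `run_tram`, the zero-based block numbering, and the test data are assumptions. Each location is read and then immediately overwritten with the next block's coefficient, as described above.

```python
def address_sequence(strided):
    """Sequential order 0, 1, ..., 63, or strided order 0, 8, 16, ..., 56, 1, 9, ..."""
    if strided:
        return [8 * (t % 8) + t // 8 for t in range(64)]
    return list(range(64))

def run_tram(blocks):
    """Read each block out transposed while writing the next block in place."""
    ram = [None] * 64
    for t in range(64):                  # initial fill: first block, sequential addresses
        ram[t] = blocks[0][t]
    outputs = []
    strided = False
    for n in range(len(blocks)):
        strided = not strided            # alternate the address order for every block
        following = blocks[n + 1] if n + 1 < len(blocks) else None
        out = []
        for t, addr in enumerate(address_sequence(strided)):
            out.append(ram[addr])        # read the stored coefficient ...
            if following is not None:
                ram[addr] = following[t] # ... then write the next block's value there
        outputs.append(out)
    return outputs

# Three blocks whose values encode (block number, stream position).
blocks = [[100 * b + t for t in range(64)] for b in range(3)]
outputs = run_tram(blocks)
expected = [[100 * b + 8 * (t % 8) + t // 8 for t in range(64)] for b in range(3)]
```

Every block leaves the RAM in transposed order even though only one 64-word memory is used, because the strided permutation is its own inverse.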
IV. VHDL BEHAVIORAL DESCRIPTION OF THE 1-D DCT COMPONENT


A. BLOCK DIAGRAM DESCRIPTION 
Fig. 3 1-D DCT block diagram








The block diagram of the 1-D DCT shown in Fig. 3 can be described in models
using VHDL. The block diagram shown here includes the 1-D DCT system discussed in
Chapter III and the additional clock generators, delay lines, control line, package 1, and
test bench. There are minor differences between this diagram and the architecture
described in the previous chapter. One consideration when simulating this
system in VHDL is that a signal flow latency will occur. Therefore, a delay line is
necessary to change the clock triggering time and solve this latency problem.
Additionally, the architecture in the previous chapter does not make it clear when to
switch the add/sub control of register G to complete the summation of 2's complement
values. It is shown here that the control line generating this control bit is triggered by
the delayed clock.

From the modeling point of view, it is rather complicated to build a 16-bit adder
in VHDL from the usual arithmetic logic. The easiest approach is to convert the
16-bit binary coefficient values into integer numbers and then do the addition or
subtraction in integers. After the integer addition or subtraction, the integers are simply
converted back to binary values. This conversion task is accomplished by procedures in
package 1. A VHDL package is a collection of functions and procedures. Of course,
some overflow/underflow situations are expected to occur during these conversions. One
last thing to note in Figure 3 is that the test bench module controls all the signal flow,
the input data, and the output data, and it also simulates the whole design.


B. BI-TO-DI AND DI-TO-BI VHDL PACKAGE 
Package 1 in VHDL is shown below,


package pack1 is                          -- Package declaration
  procedure bi_to_in                      -- Procedure 1 changes 16 bits binary into integer
    (variable x : bit_vector(15 downto 0);
     variable y : out integer);
  procedure in_to_bi                      -- Procedure 2 changes integer into binary
    (variable m : in integer;
     variable n : out bit_vector(15 downto 0));
end pack1;

package body pack1 is                     -- Package body declaration
  procedure bi_to_in                      -- First procedure that changes bits to integer
    (variable x : bit_vector(15 downto 0);
     variable y : out integer) is
    variable sum : integer := 0;
    variable p : bit_vector(15 downto 0);
  begin
    p := x;
    if p(15) = '1' then                   -- Change negative value to positive
      for i in 0 to 14 loop
        if p(i) = '1' then                -- Lowest '1' found: invert every bit above it
          for j in i to 13 loop
            p(j+1) := not p(j+1);
          end loop;
          exit;
        end if;
      end loop;
      for k in 0 to 14 loop               -- Integer conversion
        if p(k) = '1' then
          sum := sum + 2**k;
        end if;
      end loop;
      y := -sum;                          -- Convert back to negative value
    else
      for l in 0 to 14 loop               -- Positive value conversion
        if p(l) = '1' then
          sum := sum + 2**l;
        end if;
      end loop;
      y := sum;
    end if;
  end bi_to_in;                           -- end of procedure 1

  procedure in_to_bi                      -- Second procedure that changes integer to bits
    (variable m : in integer;
     variable n : out bit_vector(15 downto 0)) is
    variable temp_a : integer := 0;
    variable temp_b : integer := 0;
    variable w : bit_vector(15 downto 0);
  begin
    if m < 0 then
      temp_a := -m;                       -- Take the absolute value of negative values
    else
      temp_a := m;
    end if;
    for i in 14 downto 0 loop             -- Binary conversion
      temp_b := temp_a/(2**i);
      temp_a := temp_a rem (2**i);
      if (temp_b = 1) then
        w(i) := '1';
      else
        w(i) := '0';
      end if;
    end loop;
    if m > 0 then
      w(15) := '0';                       -- Assign positive sign bit
    else
      w(15) := '1';                       -- Assign negative sign bit
      for k in 0 to 14 loop
        if w(k) = '1' then                -- Invert negative bits to 2's complement
          for j in k to 13 loop
            w(j+1) := not w(j+1);
          end loop;
          exit;
        end if;
      end loop;
    end if;
    if w(14)='0' and w(13)='0' and w(12)='0' and w(11)='0'
       and w(10)='0' and w(9)='0' and w(8)='0' and w(7)='0'
       and w(6)='0' and w(5)='0' and w(4)='0' and w(3)='0'
       and w(2)='0' and w(1)='0' and w(0)='0'
    then
      w(15) := '0';                       -- Avoid negative zero
    end if;
    n := w;                               -- Return the converted bit pattern
  end in_to_bi;                           -- end of procedure 2
end pack1;                                -- end of package body
This VHDL package used in the simulation is similar to a subroutine library in any
other high-level language, providing specific shared operations. The difference here is
that it is possible to gather several different procedures or functions together in one
package. The pack1 package consists of two procedures, bi_to_in and in_to_bi. Bi_to_in
converts the 16-bit binary numbers (represented in 2's complement notation) into positive
or negative integers. The in_to_bi procedure converts the positive or negative integers
back to 2's complement 16-bit binary numbers. Note that in the 2's complement number
system used here, there are only 16 bits including one sign bit. In overflow situations,
the digits that overflow will be truncated.
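The intended behavior of the two conversions can be summarized with a small Python model. It is a sketch of 16-bit 2's complement conversion in general, not a translation of the VHDL procedures; the names `bits_to_int` and `int_to_bits` and the use of text strings (most significant bit first) are assumptions made for illustration.

```python
WIDTH = 16

def bits_to_int(bits):
    """Interpret a 16-character '0'/'1' string (MSB first) as a 2's complement integer."""
    value = int(bits, 2)
    return value - (1 << WIDTH) if bits[0] == '1' else value

def int_to_bits(value):
    """Encode an integer as 16-bit 2's complement; bits beyond 16 are truncated."""
    return format(value & ((1 << WIDTH) - 1), '016b')

# Round trip over a few representative values.
samples = [0, 1, -1, 1234, -1234, 32767, -32767]
round_trip = [bits_to_int(int_to_bits(v)) for v in samples]

# Overflow: 32767 + 1 does not fit in 16 bits and wraps to a negative number.
wrapped = bits_to_int(int_to_bits(32767 + 1))
```

The last line shows why overflow has to be watched in the simulation: the truncated sum changes sign instead of saturating.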


C. CLOCK GENERATOR MODULE (CLOCK_GE)
The block diagram of the "clock_ge" is shown in Figure 4. The interface connection
(port map in VHDL) is also shown. This tells how the circuit can be connected to the
test bench. The VHDL source code of clk.vhd is shown below,


entity clock_ge is                          -- Entity declaration
  port (CLCK : inout bit);
end clock_ge;
architecture clk_ctl of clock_ge is         -- Architecture declaration
begin
  process (CLCK)                            -- Process declaration
    variable I : integer := 0;
  begin                                     -- Process begin
    CLCK <= not CLCK after 5 ns;            -- Switching clock generation
    I := I + 1;
    assert I <= 80                          -- Assertion terminates the infinite process
      report "job done"
      severity Error;
  end process;                              -- End of process
end clk_ctl;                                -- End of architecture


Fig. 4 clock_ge block diagram 


The signal "CLCK" in the sensitivity list of the source code provides the clock
for all the circuits. The initial value of CLCK is '0'. Its value is changed to '1' after
5 ns. Since a process in VHDL is basically an infinite loop, it is necessary to use an
"assert" statement to terminate the process. By incrementing the counter "I", the job
can be terminated appropriately after 80 iterations.


D. PARALLEL SHIFT REGISTER MODEL (LOAD). 







Fig. 5 Serial load parallel shift register block diagram 
Figure 5 shows the detailed block diagram of the parallel shift register (LOAD). 
The source code in VHDL is shown below 


entity LOAD is
  port (AI : in bit_vector(15 downto 0);
        B0,B1,B2,B3,B4,B5,B6,B7 : out bit_vector(15 downto 0);
        CLK : in bit);
end LOAD;
architecture BEH of LOAD is
  type shift is array (0 to 7) of bit_vector(15 downto 0);
begin
  process
    variable A : shift;
    variable I,count : integer := 0;
  begin
    wait until CLK'event and CLK = '1';     -- Clock controls the timing
    for count in 0 to 7 loop
      wait until CLK'event and CLK = '1';
      for I in 0 to 6 loop                  -- Push input values down to correct position
        A(I) := A(I+1);
      end loop;
      A(7) := AI;
      if (count = 7) and (CLK'event and CLK='1') then   -- Output data
        B0 <= A(7);
        B1 <= A(6);
        B2 <= A(5);
        B3 <= A(4);
        B4 <= A(3);
        B5 <= A(2);
        B6 <= A(1);
        B7 <= A(0);
      end if;
    end loop;
    wait on AI,CLK;        -- Process activated when sensitivity signal changes
  end process;
end BEH;


The input 16-bit data come from AI column by column. The speed of the input data
is controlled by the test bench. Note that the first datum that appears is the 8th pixel
value of the first column. In other words, the sequential order of the incoming data is
7, 6, ..., 0. In this order, the data are pushed down into the correct positions, and the
1-D DCT can be done correctly. After the 1-D DCT computation in Figure 3, the
corresponding spectral coefficients will be put back in the correct order, i.e., 0, 1, 2, ...,
7. The "LOAD" module outputs the data in parallel to the second circuit "SHIFT" after
eight clock cycles (count = 7). After that, it processes another new column of data.
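The ordering performed by LOAD can be traced with a brief Python model. It is only a sketch of the data movement (clocking and signal timing are not modeled), and the function name `load_column` and the sample values are assumptions.

```python
def load_column(stream):
    """Model of LOAD: shift eight words in serially, then output them in parallel."""
    A = [0] * 8
    for word in stream:              # one word per clock
        A = A[1:] + [word]           # A(I) := A(I+1) for I = 0..6, then A(7) := AI
    # Parallel output after the eighth clock: B0 <= A(7), ..., B7 <= A(0)
    return [A[7 - n] for n in range(8)]

# Pixels are labelled 0..7; they arrive in the order 7, 6, ..., 0.
pixels = {n: 10 * n for n in range(8)}
stream = [pixels[n] for n in range(7, -1, -1)]
B = load_column(stream)
```

With the reversed arrival order, output `B[n]` carries pixel n, which is the "correct position" referred to in the text.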
E. SHIFT-TWO-REGISTER MODEL (SHIFT).
Fig. 6 Shift two register block diagram

The block diagram for SHIFT is shown in Figure 6. It includes a second clock
generator with three delay gates. Since the incoming pixel values pass through the
parallel shift register (LOAD), which causes a delay of one clock cycle, it is necessary
to compensate for this latency by delaying the clock which triggers the shift-two-register
(SHIFT). Another clock, which runs twice as fast as the original clock, is used to step
the original clock through the delay line. The VHDL source code of this faster clock
is similar to the previously discussed clock generator except that the switching period
is half as long. The assertion count for termination is therefore twice as large. The
delay line consists of shift registers. The VHDL source code of the DELAY and the shift
register is as follows


entity delay is
  port (a : bit; b : out bit; CLK : bit);   -- Normal clock coming in from port
end delay;
architecture beh of delay is
begin
  process
    variable x : bit;
  begin
    wait until CLK'event and CLK = '1';     -- Faster clock controls timing
    x := a;                                 -- Shifting the incoming clock
    b <= x;
    wait on CLK,a;
  end process;
end beh;
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entity shift is 
port(b10,bil ,bi2,bi3 ,bi4 ,bi5,bi6,bi7 : in bit_vector(15 downto 0); 
bo0,bo1 ,bo2,bo03,b04,b05,b06,bo7 : out bit_vector(1 downto 0); 
CLK : in bit); -- Port declaration, eight input and output 
end shift; 
architecture beh of shift is 
begin 
process 
variable I : integer := 0; -- counter as well as index 
begin 
for r in 0 to 7 loop 
wait until CLK’event and CLK = ’1’; 
bo0(0) < = bi0(D); -- "q" = 0 binary weight 
bo0(1) < = bi0(I+1); -- "q" = 1 binary weight 
bol(0) <= bil(D; 
bol(1) <= bil([+1); 
bo2(0) < = bi2(I); 
bo2(1) < = bi2([+1); 
bo3(0) < = bi3(I); 
bo3(1) < = bi3(I+1); 
bo4(0) < = bi4(1); 
bo4(1) < = bi4(I+1); 
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bo5(0) < = bi5(I); 
boS(1) < = bid(I+1); 
bo6(0) < = bi6(1); 
bo6(1) < = bi6(I+ 1); 
bo7(0) < = bi7(D; 
bo7(1) <= bi7(I+1); 
I:= 1+ 2; -- increment of two 
end loop; 
I := Q; -- reset the counter for next column of data 
wait on CLK,bi0,bil ,bi2,b13,b14,bi5 ,bi6,bi7; -- wait for new data 
end process; 
end beh; 


The data are input to the shift register in 16-bit words and output in 2-bit words.
Note that the counter "I" has been used as an index for each data word. Therefore, a
reset (I := 0) is necessary after each column of words is done. Otherwise, the index
would run out of range, giving a run time error in the VHDL simulation.


F. 2-BIT ADDER/SUBTRACTOR MODEL (ADDSUB)
The 2-bit adder/subtractor module is shown in Figure 7. The "adsu" VHDL source
code is shown in Appendix A. A simple flow chart in Figure 8 shows the behavior described
in VHDL. There are eight 2-bit words input into this circuit. It is necessary to do the
"serial" 2-bit addition or subtraction according to the expanded Eqs. (30) and (31). Since the
incoming data have been presented in 2's complement notation, 2's complement
addition or subtraction should be used. In addition, the 2-bit serial operation
must take into account any carry generated previously. In other words, the first 2-bit
addition/subtraction might generate a carry, and this carry must be passed on to the
next 2-bit add/sub computation. The simplest way to solve this problem is to use a 2-bit
adder accompanied by a register holding the carry bit for the next addition/subtraction.
For the subtraction case, it is necessary to convert the subtrahend into 2's complement
notation and then use the same 2-bit adder to accomplish the computation. What has been
done here is to convert the subtrahend into 1's complement first and then add "1" to it
at the very first subtraction. The incoming subtrahend is just converted into 1's
complement notation and the adder takes care of the "1" addition. In this way, the serial
subtraction is accomplished. There are four 2-bit adders and four 2-bit subtractors in the
source code. The "cr" bit sets the adder carry at the beginning to zero and the "st" bit
sets the subtractor carry to "1". Later on, the adder/subtractor will take care of the carry
by itself. For the convenience of notation, the two incoming 2-bit data words and the carry
bit have been combined into a 5-bit word, and the addition is done in the 2-bit adder
block. How the 2-bit adder block is formed is explained further in the
later discussion.

Fig. 7 2-bit add/sub block diagram
Fig. 8 "adsu" flow chart
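The carry handling described above can be sketched in a few lines of VHDL. This is only an illustration and not the "adsu" code of Appendix A: the package name serial_add_sketch, the procedure name add2, and the small self-checking entity add2_demo are all hypothetical. The procedure adds one 2-bit slice and leaves the carry in a variable that the caller keeps between slices, which plays the role of the carry register.

```vhdl
-- Hypothetical sketch: 2-bit serial addition with a saved carry bit.
package serial_add_sketch is
  procedure add2(a, b  : in    bit_vector(1 downto 0);
                 carry : inout bit;
                 sum   : out   bit_vector(1 downto 0));
end serial_add_sketch;

package body serial_add_sketch is
  procedure add2(a, b  : in    bit_vector(1 downto 0);
                 carry : inout bit;
                 sum   : out   bit_vector(1 downto 0)) is
    variable c : bit;
  begin
    c := carry;                         -- carry left over from the previous slice
    for i in 0 to 1 loop                -- two full-adder stages, low bit first
      sum(i) := a(i) xor b(i) xor c;
      c := (a(i) and b(i)) or (c and (a(i) xor b(i)));
    end loop;
    carry := c;                         -- saved for the next 2-bit slice
  end add2;
end serial_add_sketch;

use work.serial_add_sketch.all;
entity add2_demo is
end add2_demo;

architecture test of add2_demo is
begin
  process
    variable carry : bit := '0';        -- '0' to start an addition
    variable s     : bit_vector(1 downto 0);
  begin
    add2("11", "01", carry, s);         -- low slice: 3 + 1 gives sum 00, carry 1
    assert s = "00" and carry = '1' report "low slice wrong";
    add2("00", "00", carry, s);         -- next slice absorbs the carry: sum 01
    assert s = "01" and carry = '0' report "high slice wrong";
    wait;
  end process;
end test;
```

For a serial subtraction under the same assumptions, the caller would pass "not b" as the second operand and start with carry = '1', which corresponds to the 1's complement plus "1" scheme described in the text.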




G. SHIFT REGISTER MODEL (REG)

The shift register block diagram is shown in Figure 9. The signal is input from port a and
output to port b. The shift register model (REG) VHDL source code is shown below.

Fig. 9 Shift register (reg) block diagram
entity reg is
  port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0); -- input port
       b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0); -- output port
       CLK : bit);
end reg;
architecture beh of reg is
begin
  process
    variable d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(1 downto 0);
  begin
    d0 := a0; -- Substitute the input signal in a variable
    d1 := a1;
    d2 := a2;
    d3 := a3;
    d4 := a4;
    d5 := a5;
    d6 := a6;
    d7 := a7;
    wait until CLK'event and CLK = '1'; -- Clock control
    b0 <= d0; -- shift the variable to output signal
    b1 <= d1;
    b2 <= d2;
    b3 <= d3;
    b4 <= d4;
    b5 <= d5;
    b6 <= d6;
    b7 <= d7;
    wait on CLK;
  end process;
end beh;


This circuit is the simplest one. The only effect of this code is to use a signal
assignment statement to simulate a signal buffer causing a latency period of one clock
cycle. The "wait until CLK'event and CLK = '1';" statement provides the timing
control. The "wait on CLK" statement reactivates the process whenever the
clock changes its state.


H. READ ONLY MEMORY MODEL (ROM)

Figure 10 shows the read only memory block diagram. The VHDL source code
is included in Appendix A. There are eight 2-bit words input to this block, and sixteen
16 X 16 words corresponding to the 1-D DCT multiplication coefficients being read out.
The outputs of four adders with binary weight q = 0's and q = 1's bits form two 4-bit
address buses to access the corresponding ROM multiplication coefficients. The same
situation happens for subtraction. There are sixteen individual ROM locations with
sixteen different values stored in them. The reasons for having sixteen ROM locations and
sixteen different stored values are discussed in detail in later sections.
Note that in the address assignment part of the source code, the order of the addresses
starts from e0, e1, e2, e3 and ends with e7, e6, e5, e4. A detailed explanation of this
ordering is also given in later discussion. The values stored in the individual ROM have been
converted from the sum of coefficients "C,," to 16-bit 2's complement binary values.
The values of "C,," are calculated according to Eq. (15) and Eq. (16).

Fig. 10 ROM block diagram


I. SHIFT RIGHT 1-BIT REGISTER MODEL (SHI_1)

Figure 11 shows the shift right 1-bit register block diagram. Its VHDL source code
is included in Appendix A. The shift right 1-bit register receives sixteen 16-bit words and
performs the right shift operation on eight of them. It outputs the resultant sixteen 16-bit
words to the next circuit. The only difference between the input and the output values
is that the odd numbered 16-bit words have been shifted right by 1 bit position. At the same
time, the original 16th bit (sign bit) of each odd word has been checked and replaced by
a proper bit ("0" or "1", depending on whether it has a positive or negative value) to
properly extend the binary 2's complement number.

Fig. 11 Shi_1 register block diagram
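The sign extension described above amounts to an arithmetic right shift. A minimal sketch, assuming 16-bit words as in the text, is given here; the package name shift_sketch and the function name shr1 are hypothetical and are not taken from the Appendix A source.

```vhdl
-- Hypothetical helper: arithmetic right shift by one bit for a 16-bit word.
package shift_sketch is
  function shr1(x : bit_vector(15 downto 0)) return bit_vector;
end shift_sketch;

package body shift_sketch is
  function shr1(x : bit_vector(15 downto 0)) return bit_vector is
    variable y : bit_vector(15 downto 0);
  begin
    y(15) := x(15);                     -- keep the sign bit ("0" or "1")
    y(14 downto 0) := x(15 downto 1);   -- move every bit one place to the right
    return y;
  end shr1;
end shift_sketch;
```

With this sketch, a negative word such as "1000011111000010" becomes "1100001111100001", so the result stays negative. Applying the function twice gives the 2-bit shift used later in SHI_2.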


J. ADDER/SUBTRACTOR-G MODEL (ADD_G)

Figure 12 shows the add_g block diagram. It includes one control circuit and five
delay gates. The control circuit enables the add_g to do addition or subtraction. The
purpose of the delay line is to compensate for signal latency, so that the add/subtract
controller is activated at the right time, when the signal arrives.

Fig. 12 Add_g block diagram

The add_g VHDL source code as well as the control and the delay VHDL source
code are shown below.
entity control is
  port(CLK : bit; ct : out bit);
end control;
architecture beh of control is -- control
begin
  process
    variable i : integer := 0;
  begin
    wait until CLK'event and CLK = '1'; -- Clock triggers the circuit
    if i = 7 then
      ct <= '1'; -- output '1' every eighth clock period
    else
      ct <= '0';
    end if;
    i := i + 1;
    if i = 8 then
      i := 0; -- Reset the counter
    end if;
  end process;
end beh;
entity delay10 is
  port(a : bit; b : out bit; CLK : bit);
end delay10;
architecture beh of delay10 is -- delay
begin
  process
    variable x : bit;
  begin
    wait until CLK'event and CLK = '1';
    x := a;
    b <= x;
    wait on CLK,a;
  end process;
end beh;
use work.pack1.all; -- All the functions in pack1 are used
entity add_g is
  port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
       bit_vector(15 downto 0); -- input port
       b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0); -- output port
       CLK,as : bit);
end add_g;
architecture beh of add_g is
begin
  process
    variable x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,
             n1,n2,n3,n4,n5,n6,n7,n8 : bit_vector(15 downto 0);
    variable y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12,y13,y14,y15,y16,
             m1,m2,m3,m4,m5,m6,m7,m8 : integer := 0;
  begin
    wait until CLK'event and CLK = '1';
    x1 := a1; x2 := a2; x3 := a3; x4 := a4; -- input values
    x5 := a5; x6 := a6; x7 := a7; x8 := a8;
    x9 := a9; x10 := a10; x11 := a11; x12 := a12;
    x13 := a13; x14 := a14; x15 := a15; x16 := a16;
    -- Procedure call to do integer conversion
    bi_to_in(x1,y1); bi_to_in(x2,y2); bi_to_in(x3,y3); bi_to_in(x4,y4);
    bi_to_in(x5,y5); bi_to_in(x6,y6); bi_to_in(x7,y7); bi_to_in(x8,y8);
    bi_to_in(x9,y9); bi_to_in(x10,y10); bi_to_in(x11,y11); bi_to_in(x12,y12);
    bi_to_in(x13,y13); bi_to_in(x14,y14); bi_to_in(x15,y15); bi_to_in(x16,y16);
    if as = '0' then
      m1 := y1 + y2; m2 := y3 + y4; m3 := y5 + y6; m4 := y7 + y8;
      m5 := y9 + y10; m6 := y11 + y12; m7 := y13 + y14; m8 := y15 + y16;
    else -- Control gives the subtraction instruction
      m1 := y1 - y2; m2 := y3 - y4; m3 := y5 - y6; m4 := y7 - y8;
      m5 := y9 - y10; m6 := y11 - y12; m7 := y13 - y14; m8 := y15 - y16;
    end if;
    -- Procedure call to do binary conversion
    in_to_bi(m1,n1); in_to_bi(m2,n2); in_to_bi(m3,n3); in_to_bi(m4,n4);
    in_to_bi(m5,n5); in_to_bi(m6,n6); in_to_bi(m7,n7); in_to_bi(m8,n8);
    b1 <= n1; b2 <= n2; b3 <= n3; b4 <= n4;
    b5 <= n5; b6 <= n6; b7 <= n7; b8 <= n8;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,CLK;
  end process;
end beh;

The control is triggered by the clock, and an output of the control bit "ct" is
generated. On the 8th clock period, "ct" becomes "1"; it equals "0" otherwise. The
delay is also triggered by the clock. It receives one bit and outputs the same bit one clock
cycle later.

Add_g has sixteen 16-bit word inputs and eight 16-bit word outputs. It performs
16-bit addition or subtraction. As discussed previously, it is rather complicated to build
up a 16-bit adder/subtractor in a VHDL structural approach. The easiest way is to
convert the 16-bit binary words into integers. For this, "use work.pack1.all" has to be
declared at the beginning of the entity in order to call the "bi_to_in" procedure in
pack1. "Work" represents the working library used, and "pack1.all" makes all the
declarations in package pack1 visible. After the conversion of binary values to integer
values, addition or subtraction is done according to the control input "as". The results
are then converted back to binary values for output. Of course, the timing is always
synchronized by the clock.
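The conversion procedures themselves live in pack1, which is not listed in this chapter. The following is a hypothetical sketch of what two such procedures could look like for 16-bit 2's complement words; it reuses the names "bi_to_in" and "in_to_bi" only for readability, the package name conv_sketch is invented, and the actual pack1 code may differ (in particular in how it treats results outside the 16-bit range).

```vhdl
-- Hypothetical sketch of 16-bit 2's complement <-> integer conversion.
package conv_sketch is
  procedure bi_to_in(b : in bit_vector(15 downto 0); n : out integer);
  procedure in_to_bi(n : in integer; b : out bit_vector(15 downto 0));
end conv_sketch;

package body conv_sketch is
  procedure bi_to_in(b : in bit_vector(15 downto 0); n : out integer) is
    variable acc : integer := 0;
  begin
    for i in 14 downto 0 loop           -- accumulate the 15 magnitude bits
      acc := acc * 2;
      if b(i) = '1' then
        acc := acc + 1;
      end if;
    end loop;
    if b(15) = '1' then                 -- the sign bit carries weight -2**15
      acc := acc - 32768;
    end if;
    n := acc;
  end bi_to_in;

  procedure in_to_bi(n : in integer; b : out bit_vector(15 downto 0)) is
    variable v : integer;
  begin
    v := n mod 65536;                   -- wrap into 0 .. 65535 (assumed behavior)
    for i in 0 to 15 loop               -- peel off bits, least significant first
      if v mod 2 = 1 then
        b(i) := '1';
      else
        b(i) := '0';
      end if;
      v := v / 2;
    end loop;
  end in_to_bi;
end conv_sketch;
```

In this sketch a sum larger than 32767 wraps around to a bit pattern with the sign bit set, which is one way the overflow behavior discussed in the result analysis could arise.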


K. SHIFT REGISTER-H MODEL (REG_H)
The reg_h block diagram is shown in Figure 13. It functions just like "reg", except
that "reg" handles 2-bit words and "reg_h" handles 16-bit words. The VHDL source codes
are the same except for the declaration of the length of the bit vectors.

Fig. 13 Shift register_h block diagram


L. 16-BIT ADDER_I MODEL (ADD_I)
Figure 14 shows the block diagram of the 16-bit adder (ADD_I). ADD_I and
ADD_G are basically the same. ADD_I does not have the "as" control bit or the "if"
statement in the VHDL source code to do the subtraction. Another big difference is
that ADD_I is not triggered by the clock. It adds up the two 16-bit inputs with no delay.
It also does integer addition with the procedures in pack1. The two inputs come from
REG_H and the feedback output from the SHI_2, which shifts the result to the right by
2 bits. This is shown in Figure 2. The VHDL source code for ADD_I is shown below.

Fig. 14 16-bit add_i block diagram

use work.pack1.all;
entity add_i is
  port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16 :
       bit_vector(15 downto 0);
       b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0));
end add_i;
architecture beh of add_i is
begin
  process
    variable x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,
             n1,n2,n3,n4,n5,n6,n7,n8 : bit_vector(15 downto 0);
    variable y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12,y13,y14,y15,y16,
             m1,m2,m3,m4,m5,m6,m7,m8 : integer := 0;
  begin
    x1 := a1; x2 := a2; x3 := a3; x4 := a4;
    x5 := a5; x6 := a6; x7 := a7; x8 := a8;
    x9 := a9; x10 := a10; x11 := a11; x12 := a12;
    x13 := a13; x14 := a14; x15 := a15; x16 := a16;
    bi_to_in(x1,y1); bi_to_in(x2,y2); bi_to_in(x3,y3); bi_to_in(x4,y4);
    bi_to_in(x5,y5); bi_to_in(x6,y6); bi_to_in(x7,y7); bi_to_in(x8,y8);
    bi_to_in(x9,y9); bi_to_in(x10,y10); bi_to_in(x11,y11);
    bi_to_in(x12,y12);
    bi_to_in(x13,y13); bi_to_in(x14,y14); bi_to_in(x15,y15);
    bi_to_in(x16,y16);
    m1 := y1 + y2; m2 := y3 + y4; m3 := y5 + y6; m4 := y7 + y8;
    m5 := y9 + y10; m6 := y11 + y12; m7 := y13 + y14; m8 := y15 + y16;
    in_to_bi(m1,n1); in_to_bi(m2,n2); in_to_bi(m3,n3); in_to_bi(m4,n4);
    in_to_bi(m5,n5); in_to_bi(m6,n6); in_to_bi(m7,n7); in_to_bi(m8,n8);
    b1 <= n1; b2 <= n2; b3 <= n3; b4 <= n4;
    b5 <= n5; b6 <= n6; b7 <= n7; b8 <= n8;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16;
  end process;
end beh;


M. SHIFT RIGHT 2-BIT REGISTER MODEL (SHI_2)

Fig. 15 Shift right 2-bit register block diagram

The shift right 2-bit register (shi_2) block diagram is shown in Figure 15. It
includes another clock generator running twice as fast to trigger the delay unit, which
delays the normal clock by one period. It has another clear line (clr) from the test bench
that clears the register every eight clock cycles. The VHDL source code of SHI_2 is
shown in Appendix B.

The SHI_2 model has eight 16-bit word inputs from ADD_I and has sixteen 16-bit
word outputs. The input values are checked for the sign bit, and the SHI_2 shifts
the data 2 bits to the right in proper 2's complement representation. There are eight
blocks in the SHI_2 module. The results are updated and fed back to the ADD_I module to
perform an addition with the incoming data values. In every 8th clock cycle, the results
are shifted in parallel to the "parallel load serial shift" register (RESULT). During the same
cycle, the shift right 2-bit results are cleared, and the SHI_2 is ready for the next column
operation.


N. PARALLEL LOAD SERIAL SHIFT REGISTER MODEL (RESULT)

The block diagram of the parallel load serial shift register (RESULT) is shown in
Figure 16. There are eight inputs from SHI_2; RESULT puts out only one value at a
time. The VHDL source code of RESULT is shown below.

Fig. 16 Parallel shift serial output register block diagram

entity result is
  port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
       k : out bit_vector(15 downto 0); CLK : bit);
end result;
architecture beh of result is
  type r is array (0 to 7) of bit_vector(15 downto 0);
begin
  process
    variable x : r;
  begin
    x(0) := a1; x(1) := a2; x(2) := a3; x(3) := a4;
    x(4) := a5; x(5) := a6; x(6) := a7; x(7) := a8;
    for i in 0 to 7 loop
      wait until CLK'event and CLK = '1';
      k <= x(i);
    end loop;
    wait on a1,a2,a3,a4,a5,a6,a7,a8,CLK;
  end process;
end beh;


Eight 16-bit words are input into RESULT every 8th clock cycle. They are pushed
out one value at a time at every clock period. After all eight values have been output,
new values are fed in again for the next cycle.


O. TEST BENCH

The test bench block diagram is shown in Figure 17. It actually includes all the
intermediate signals, the control signals, and the input and output signals. The VHDL
source code for the test bench is shown in Appendix B. All the components used in the
system have been declared and instantiated. The signals used for the simulation are
declared also. The configuration statement binds all the components to the test system.
The input pixel values are fed into the system through "di", and it is simulated. The
results of the simulation are collected by signal "p". A table of the simulation results
"p" is generated and analyzed to see if the design is functioning correctly.

Fig. 17 Block diagram of Test Bench


V. SIMULATION OUTPUT ANALYSIS AND EXPERIENCE 


A. FORMATION OF ROM STORAGE VALUES 

As discussed before, there is only a sixteen-word ROM for each multiplication
coefficient due to the symmetry in the DCT. The coefficients can be calculated according to
Eq. (15) and Eq. (16).


Table I: Multiplication Coefficients

| .0975451610 | .2777851165 | .4157348061 | -.4903926402 |


Since N = 8, the expanded equation of Eq. (30) and Eq. (31) can be derived as in Table
I after substituting the proper index (m, k). The labels U0, U2, ..., V7 are included in
the table for better understanding. Labels A, B, C, D stand for bit patterns. For example,
if A = 1, B = 0, C = 1, D = 1, then the values in columns 1, 3, and 4 should be
summed up to get the corresponding multiplication coefficient sum stored in the ROM.
The bit pattern in the circuit has two weighted groups (LSB group q = 0's, and MSB
group q = 1's). The coefficient values for these two patterns are exactly the same.
Therefore, there are only 8 X 16 = 128 different coefficient sums stored in ROM.
One very important fact must be stressed: the values stored in the ROM are binary
numbers, not decimal numbers, so the summed decimal coefficients have to be converted
into binary numbers. Upon inspection of Table I, it is noted that the largest possible
decimal number generated is not greater than 2 and the smallest possible decimal number
generated is not less than -2. As stated before, the number system used here is the 16-bit
2's complement number. Therefore, one sign bit, one integer bit, and fourteen fraction bits
are chosen to represent the binary numbers stored in the ROM. All the decimal coefficients
calculated according to the specific bit pattern A, B, C, D have to be converted into
binary 2's complement 16-bit numbers. This conversion operation is carried out with the
help of a small program written in Matlab listed in Appendix C. The actual values stored
in the ROM are shown in the ROM VHDL source code.
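The number format described above (one sign bit, one integer bit, fourteen fraction bits) can also be illustrated in VHDL. This is only a sketch: the thesis performed the conversion with the Matlab program of Appendix C, which is not reproduced here. The package name q14_sketch and the function name to_q14 are hypothetical, and discarding the fraction below 2^-14 by rounding toward minus infinity is an assumption.

```vhdl
-- Hypothetical sketch: real coefficient in [-2, 2) to a 16-bit 2's complement
-- word with 1 sign bit, 1 integer bit and 14 fraction bits.
package q14_sketch is
  function to_q14(x : real) return bit_vector;
end q14_sketch;

package body q14_sketch is
  function to_q14(x : real) return bit_vector is
    variable scaled : real := x * 16384.0;   -- multiply by 2**14
    variable n      : integer;
    variable b      : bit_vector(15 downto 0);
  begin
    assert x >= -2.0 and x < 2.0
      report "to_q14: value outside [-2, 2)" severity error;
    n := integer(scaled);                    -- VHDL rounds to the nearest integer
    if real(n) > scaled then                 -- step down so the fraction is dropped
      n := n - 1;
    end if;
    n := n mod 65536;                        -- negative values become 2's complement
    for i in 0 to 15 loop                    -- peel off bits, least significant first
      if n mod 2 = 1 then
        b(i) := '1';
      else
        b(i) := '0';
      end if;
      n := n / 2;
    end loop;
    return b;
  end to_q14;
end q14_sketch;
```

Under these assumptions, to_q14(0.4903926402) returns "0001111101100010", since 0.4903926402 x 16384 is about 8034.59 and 8034 is 1F62 in hexadecimal.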


B. SIMULATION AND TESTING IMAGE PATTERN (I)
The first image pattern being used is shown in Figure 18. It is a two-dimensional
cosine wave with intensity varied along the x-axis. The pixel value can be represented in 128
levels. Therefore, the pixel value of each point in this image can be obtained from the
following formula

f(x, y) = [\cos(2\pi f_x x + 2\pi f_y y) + 1] / 2 \times 128    (32)

where f_x = 1/4, f_y = 0.

Fig. 18 Pattern (I) 8 × 8 image block

After substituting the corresponding index (x, y) in Figure 18 into Eq. (32), the pixel
values represented in this 8 × 8 image block can be shown in Table II. The 12-bit binary
representations of decimal numbers 128 and 64 are "000010000000" and
"000001000000". Converting the values in Table II into 12-bit binary numbers and feeding
them column by column into the 1-D DCT VHDL model yields the corresponding 1-D
DCT spectral coefficients (in Hex) as listed in Table III. The same decimal values in
Table II have also been put into a 1-D DCT subroutine for calculation, which is in an
image processing library called Spider. The result is shown in Table IV.

Table II: 8 × 8 image pixel values of Pattern (I)
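As a quick check of Eq. (32), with the horizontal frequency equal to 1/4 and the vertical frequency equal to 0, the pixel value depends only on x and repeats every four columns. The worked values below are an illustration derived from the formula, not a copy of Table II:

```latex
\begin{aligned}
f(0,y) &= [\cos(0) + 1] / 2 \times 128 = 128,\\
f(1,y) &= [\cos(\pi/2) + 1] / 2 \times 128 = 64,\\
f(2,y) &= [\cos(\pi) + 1] / 2 \times 128 = 0,\\
f(3,y) &= [\cos(3\pi/2) + 1] / 2 \times 128 = 64.
\end{aligned}
```

These are the values 128 and 64 whose 12-bit binary forms are quoted in the text, together with 0.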

Due to time limitations, the transpose of the 1-D DCT coefficients was not carried out
in VHDL behavior models; a manual transpose is done instead. The transposed 1-D DCT
coefficients of pattern (I) in VHDL simulation are shown in Table V. The values in Table V
are converted again into binary numbers and input column by column into the 16-bit 1-D
DCT VHDL model to accomplish the 2-D DCT operation. The 2-D DCT spectral coefficients,
which have been transposed back in the VHDL simulation, are shown in Table VI. The 1-D
DCT operations in the VHDL simulation are based on integer calculation. In order to prove
that the 1-D DCT VHDL simulation result is correct, the values in Table V are converted
into integers and are shown in Table VII. The values in Table VII are again calculated
column by column using the Spider 1-D DCT subroutine. The resulting 2-D DCT spectral
coefficients are transposed and shown in Table VIII.

Table III: 1-D DCT spectral coefficients of Pattern (I) in VHDL simulation

Table IV: 1-D DCT coefficients of pattern (I) using Spider Subroutine

| 362.03 | 181.01 | 0 | 181.01 | 362.03 | 181.01 | 0 | 181.01 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Table V: Transposed 1-D DCT coefficients of pattern (I) in VHDL simulation

Table VI: 2-D DCT spectral coefficients of pattern (I) in VHDL simulation

| o1F7 | os | coo | oocs | corr | reve | 0000 | FrED |
| 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |
| 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |

Table VII: Table V in integer values





To ensure that the 1-D DCT structural calculation in the VHDL simulation is
correct, direct 1-D DCT calculation on a calculator is also carried out based on
Eq. (15) and Eq. (16). Equations (33) and (34) show the calculation example for k = 0
and k = 1.

C(0) = \frac{1}{2\sqrt{2}} (2896 + 1448 + 0 + 1448 + 2896 + 1448 + 0 + 1448)    (33)

C(1) = \frac{1}{2} (2896\cos\frac{\pi}{16} + 1448\cos\frac{3\pi}{16} + 0 + 1448\cos\frac{7\pi}{16}
       + 2896\cos\frac{9\pi}{16} + 1448\cos\frac{11\pi}{16} + 0 + 1448\cos\frac{15\pi}{16})    (34)

The results using this approach are listed in Table IX. Note that the results of Table VIII
and Table IX are very close.


Table IX: 2-D DCT coefficients of pattern (I) using direct calculation 





It is also necessary to trace the operation in the VHDL structural models shown in
Figure 2. To understand the structural operation and calculation of the 1-D DCT in the
VHDL simulation in more detail, a manual derivation and calculation are carried out.
First the values in Table V need to be converted into binary numbers, which
are shown in Table X. It is clear that only one column of Table X is not zero. Therefore,
there is only one column of the 1-D DCT that needs computation. The values in the first
column are input into the 1-D DCT VHDL model, which yields the serial 2-bit
addition/subtraction results shown in Table XI.

Table X: 16-bit binary number representation of table (V)

Table XI: Serial 2-bit addition/subtraction output

mcr) M1) 0 | 105) 10,| 00 [op | 8)
siowtae{ © | 00 | ox | or | 10 | w | 0 | oo | o |
9) 001 | 00 ft) 18 | 1 | 0 oo] 0
sic) co | oo | vo | ot | vo [| 10 | oo | 8 |

The first column in Table XI shows how the 2-bit addition/subtraction is done. The first
row on the top represents the clock cycle. The rows in the upper half (U) correspond to
"k" equal to even numbers, and the rows in the lower half (V) correspond to "k" equal to
odd numbers. Each half column has four bits, forming a bus to address the corresponding
ROM coefficients. For example, at the first clock cycle, there are two 4-bit buses. The
four least significant bits (LSB) form an "ABCD" bus corresponding to "0000" to address
the "U00" (refer to Fig. 2) ROM value. This yields the value "0000000000000000" as
output. The MSBs of the first clock cycle address the "U01" ROM value, which reads
"0000000000000000" out. It is then added to the 1-bit right-shifted value of "U00".
This result is stored in REG_H and then right-shifted 2 bits in the SHI_2 register. The
first clocked 2-bit right-shifted word is then fed back to ADD_I and added to the second
clocked result "0101101010000010". The procedure of getting this second clocked result
is just the same as that of getting the first clocked result. The summation of the first 2-bit
right-shifted number and the second clocked result "0101101010000010" is then shifted
right 2 bits, yielding "0001011010100000". This value is then added to the third clocked
result "0111000100100010", yielding "1000011111000010". This process goes on
serially until the 8th clock cycle is reached. The addressed output ROM value of the MSB
of the 8th clock cycle "0000000000000000" is subtracted from the right-shifted 1-bit
addressed ROM value of the LSB of the 8th clock cycle "0000000000000000". This result
is then added to the previous accumulated 7 clocked values, yielding
"0000011111111111". This final result is then right-shifted 2 bits, yielding
"0000000111111111", and output as the first pixel 2-D DCT coefficient of the first
column. The 8 × 8 image block of 2-D DCT coefficients of pattern I obtained by structural
manual calculation is shown in Table XII. The detailed calculation procedure is listed
in Appendix D. Note that the summation of the accumulated two clocked values and the
third clocked result generates an overflow. This overflow will eventually generate a
negative value when right-shifted 2 bits. This is an inherent drawback of using 16-bit
integer arithmetic.

Table XII: 2-D DCT coefficients of pattern (I) using manual calculation
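The accumulate-and-shift steps quoted above can be checked by hand. The following worked arithmetic is an illustration that uses only the bit patterns given in the text, written in hexadecimal and decimal:

```latex
\begin{aligned}
\texttt{0101101010000010}_2 &= \mathrm{5A82}_{16} = 23170,\\
\lfloor 23170 / 4 \rfloor &= 5792 = \mathrm{16A0}_{16} = \texttt{0001011010100000}_2,\\
\texttt{0111000100100010}_2 &= \mathrm{7122}_{16} = 28962,\\
5792 + 28962 &= 34754 = \mathrm{87C2}_{16} = \texttt{1000011111000010}_2 .
\end{aligned}
```

Since 34754 exceeds 32767, the largest positive value of a 16-bit 2's complement word, the pattern 87C2 is read as 34754 - 65536 = -30782, which is the overflow noted in the text.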
C. SIMULATION AND TEST OF IMAGE PATTERN (II)
Image pattern II is equal to image pattern I rotated by 45°. The following formula
was used to calculate each pixel value.


. lin. onclyny + 
if Gy) = [cos(2m (7) Tx 2n(7)D) 1] / 2 x 128 (35) 


Table XIII: 8 × 8 image block pixel values of pattern (II)

The 8 × 8 image block pixel values of pattern II represented in decimal numbers are
shown in Table XIII. The 2-D DCT of pattern II has been calculated in two ways,
VHDL simulation and the Spider subroutine. Using VHDL simulation first, Table XIII is
converted into binary numbers and is input column by column into the VHDL 1-D DCT
test bench. Its 1-D DCT coefficients are shown in Table XIV. For the 2-D DCT, the values
in Table XIV are then transposed manually, and the results are input into the 16-bit
VHDL 1-D DCT test bench. The 2-D DCT spectral coefficients for pattern II in VHDL
simulation are listed in Table XV.

Table XIV: 1-D DCT coefficients of pattern (II) using VHDL simulation

| 0000 | 0000 | 0000
| 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |
| 4 | rr7 | oo8a | ooga | rr7 | Fr74 | 0088 | 008A | FF74
| 3 | cops | cops | Frsa | rréa | cops | oops | FFés | FRSA
| 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |

Table XV: 2-D DCT coefficients of pattern (II) using VHDL simulation

| 2 | rrr | ooo | 000 | 0000 | 0020 | o000 | 0000 | 0000
| 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 | 0000 |


Table XVI: Pattern II 1-D DCT coefficients using Spider Subroutine 


ethos, hee 


seo.97 |e 53.97 Sono 53097 Ne 33 oe 
= 97 BB) 97 B83. 97 


a 
-69.53 | 69.53 on 53 69.53 | 69.53 
69.53 | 69.53 69.53 
90.51 | 90.51 | -90.51 - | 90.51 | 90.51 | -90.51 : 
90.51 90.51 
46.46 | -46.46 | -46.46 | 46.46 | 46.46 -46.46 | 46.46 
46. = 


be Tey oe Th) ae 757 — Tey — ao a 757 -2 iby) 
6. 157 








The 1-D DCT subroutine in Spider is used to double check the VHDL simulation
result. Values in Table XIII are calculated column by column, and the result is listed in
Table XVI. This result is compared with that of Table XIV for verification.

2-D DCT floating point calculation is also used to check the VHDL simulation.
Again for the same reason of comparison, values in Table XIV are chosen and converted
into integers. After the Hex-integer conversion, these values are transposed again and
calculated by the 1-D DCT Spider subroutine column by column. The results are shown in
Table XVII.

Table XVII: 2-D DCT coefficients of pattern (II) using floating point calculation

17 | 1023.9 |) 0 [20 |e 08 | 500 [eons Om enon
pat o [onl oy 0 ros | ore ee

D. RESULT ANALYSIS

There are four methods being used to prove the accuracy of the VHDL structural
1-D DCT in VHDL simulation. Comparing Tables VI, VIII, IX, and XII, the
similarities among them are obvious. Tables VIII and IX are almost the same, while
Tables VI and XII need to be converted into decimal numbers for ease of comparison.
Table VI needs to be converted into 16-bit binary values first; the definition
of the 16-bit binary number system (1 sign bit, 1 integer bit and 14 fraction bits) is then
used to convert the binary words into decimal numbers.

The multiplication factor as to how many times the number is being right-shifted
here is 2'’. The equivalent integer values of Table VI and Table XII are shown in Tables
XVIII and XIX. Most of the pixel values are similar to those in Tables VIII and IX with
a few differences. There are two reasons that can explain this phenomenon. First, there
is a limitation in 16-bit binary number representation. Those fractional numbers that are
smaller than 2^-14 are truncated. This will cause small differences between Tables VI, XII
and Tables XVIII, XIX. The second reason is due to the overflow situation. The
accumulated sum of the coefficients might be greater than the biggest number that a
16-bit binary number system could represent. This overflow situation will cause larger
differences between Tables VI, XII and Tables XVIII, XIX.

Table XVIII: Equivalent decimal numbers of table (VI)

Toe [0 [0 [1s [200 |e] 0 |
To fo fofolololo

A way is found to indicate the overflow situation. Checking can be made in
ADD_G and ADD I by adding the following VHDL source code right after the integer 
to binary number conversion. 

if ((x,(15) = °1’ and x,(15) = ’1’ and n,(15) = ’0’) or 

(x,(15) = ’0’ and x,(15) = ’0’ and n,(15) = °1’)) then 
overl <= ’1’; 
if ((x,(15) = °1’ and x,(1$) = ’1’ and n,(15) = ’0’) or 
(x,(15) = ’0’ and x,(15) = ’0’ and n,(15) = ’1’)) then 
over2 <= 1’; 
if ((x,,(15) = °1’ and x,,(15) = ’1’ and n,(15) = ’0’) or 
(x,<(15) = ’0’ and x,,(15) = ’0’ and n,(15) = ’1’)) then 
over8 <= °1’; 
Of course, at the port declaration, a special signal declaration must be made in order to 
notify the test bench about this overflow condition. VHDL source code for the port 
declaration is shown below. 
port(...; b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
     over1,over2,over3,over4,over5,over6,over7,over8 : out bit;
     CLK : bit);
In addition to the port modification, the test bench component's port also needs to be
modified. The last step in signaling this overflow condition is to declare signals and
enable the "port map" to receive the overflow signals coming from ADD_G and ADD_I.
The VHDL source code is shown below.


g: add_g port map(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
        g1,g2,g3,g4,g5,g6,g7,g8,ck,qo,ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8);
i: add_i port map(h1,r1,h2,r2,h3,r3,h4,r4,h5,r5,h6,r6,h7,r7,h8,r8,
        i1,i2,i3,i4,i5,i6,i7,i8,ck,ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8);
Whenever an overflow bit "ov#" changes to '1', it indicates that the corresponding pixel
value has overflowed.
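
The sign-bit test used for overflow detection can be illustrated with a small stand-alone adder. The following is a minimal sketch under stated assumptions, not the ADD_G or ADD_I model of this thesis: the entity name add_ovf and its port names are hypothetical, and the operands are assumed to be 16-bit 2's complement words.

```vhdl
-- Sketch only: entity name, port names and widths are assumptions.
entity add_ovf is
  port (x1, x2 : in  bit_vector(15 downto 0);
        n1     : out bit_vector(15 downto 0);
        over1  : out bit);
end add_ovf;

architecture beh of add_ovf is
begin
  process (x1, x2)
    variable sum   : bit_vector(15 downto 0);
    variable carry : bit;
  begin
    carry := '0';
    -- Ripple-carry addition, least significant bit first.
    for i in 0 to 15 loop
      sum(i) := x1(i) xor x2(i) xor carry;
      carry  := (x1(i) and x2(i)) or (x1(i) and carry) or (x2(i) and carry);
    end loop;
    n1 <= sum;
    -- Overflow: both operands share a sign and the sum has the other sign.
    if (x1(15) = '1' and x2(15) = '1' and sum(15) = '0') or
       (x1(15) = '0' and x2(15) = '0' and sum(15) = '1') then
      over1 <= '1';
    else
      over1 <= '0';
    end if;
  end process;
end beh;
```

Unlike the code fragment above, this sketch clears the flag again when a later addition does not overflow. Whether the flag should instead stay at '1' once set is a design choice.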


E. EXPERIENCE 


My experience in the work can be listed as follows. 


1. Input Data Sequential Order error 
The pixels entering the parallel shift register were assumed to arrive in the
order 7, 6, ..., 0. According to the transposed sequence, however, the actual input data
arrive in the order 0, 1, 2, ..., 7. An error would therefore occur unless the sequence
of the transposed data is reversed, which means that a reversing circuit would have to be
added between the transpose circuit and the input "load" circuit. Adding an extra
circuit is rather complicated. The easiest way to solve this problem is to input the
data in the order 0, 1, ..., 7 and switch the subtrahend connections (0-7, 1-6, 2-5, 3-4)
in the 2-bit adder/subtractor circuit. In this way, the input data and the output data
are always in the order 0, 1, 2, ..., 7, and no extra circuit is necessary.


2. Formation of 2-bit Adder in VHDL source code 
The interface of a 2-bit adder has five inputs (two for the augend, two for the
addend, and one for the carry-in) and three outputs (two for the sum and one for the
carry-out). A truth table covering all possible input combinations can therefore be
made. With five inputs, there are 2^5 = 32 combinations. After building the 8 X 32
truth table (eight input and output columns by 32 rows), Karnaugh map reduction can be
used to minimize the Boolean expressions. These Boolean expressions are what is used in
the VHDL source code. A detailed example is listed in Appendix E.
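
As a cross-check on the truth table approach, the same 2-bit addition can be written as two chained 1-bit full adders. The sketch below is an alternative formulation rather than the form used in this thesis; the entity name add2 and its port names are hypothetical. For every input combination it should produce the same sum and carry-out as a correctly minimized sum-of-products expression.

```vhdl
-- Sketch: entity and port names are hypothetical.
entity add2 is
  port (a, b : in  bit_vector(1 downto 0);  -- augend and addend
        cin  : in  bit;                     -- carry-in
        s    : out bit_vector(1 downto 0);  -- 2-bit sum
        cout : out bit);                    -- carry-out
end add2;

architecture beh of add2 is
begin
  process (a, b, cin)
    variable c1 : bit;  -- carry from bit 0 into bit 1
  begin
    -- Bit 0: ordinary full adder.
    s(0) <= a(0) xor b(0) xor cin;
    c1   := (a(0) and b(0)) or (a(0) and cin) or (b(0) and cin);
    -- Bit 1: full adder fed by the intermediate carry.
    s(1) <= a(1) xor b(1) xor c1;
    cout <= (a(1) and b(1)) or (a(1) and c1) or (b(1) and c1);
  end process;
end beh;
```

The chained form is shorter to write, while the flattened form obtained from the Karnaugh map corresponds more directly to a two-level gate implementation.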


3. No Timing control in Add_i Model 
Almost every circuit needs a clock to trigger and control its sequential
process. ADD_I is a special adder circuit without a triggering clock. As mentioned
earlier, the accumulator of the serial bit result consists of ADD_I and SHI_2. ADD_I
adds the incoming clocked result to the latest accumulated result right after it has
been right-shifted by 2 bits. If both circuits were triggered by the clock, there would
be a delay of one clock cycle between ADD_I and SHI_2. In other words, ADD_I would add
the incoming clocked result to the right-shifted accumulated result from one clock cycle
earlier rather than the latest one, which would cause an error in the output
coefficients. The way to remove this one-cycle delay between ADD_I and SHI_2 is to let
the clock trigger only one of the two circuits. An alternative that was considered is
to let the clock trigger ADD_I instead of SHI_2. However, the experiment shows that
this cannot be done: SHI_2 has to be cleared on every 8th clock cycle, and this clearing
needs a counter to determine the exact time. In addition, SHI_2 has to output the
correct accumulated result every 8th clock period. Both functions need a clock to
control the timing. This is why ADD_I was chosen not to be triggered by the clock.
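
The arrangement of an unclocked adder feeding a clocked shift register can be sketched as follows. This is a simplified illustration under stated assumptions, not the ADD_I and SHI_2 models of this thesis: the entity name acc_sketch, the clr input and the helper function add16 are hypothetical, and the counter that would clear the register on every 8th clock is left out.

```vhdl
-- Sketch only: names and the clr input are assumptions.
entity acc_sketch is
  port (din : in  bit_vector(15 downto 0);   -- incoming clocked partial result
        clr : in  bit;                       -- '1' restarts the accumulation
        clk : in  bit;
        q   : out bit_vector(15 downto 0));  -- running accumulated result
end acc_sketch;

architecture beh of acc_sketch is
  signal shifted : bit_vector(15 downto 0);  -- register after the 2-bit shift
  signal total   : bit_vector(15 downto 0);  -- adder output

  -- Two's-complement ripple-carry addition; the final carry is dropped.
  function add16(a, b : bit_vector(15 downto 0)) return bit_vector is
    variable s : bit_vector(15 downto 0);
    variable c : bit := '0';
  begin
    for i in 0 to 15 loop
      s(i) := a(i) xor b(i) xor c;
      c    := (a(i) and b(i)) or (a(i) and c) or (b(i) and c);
    end loop;
    return s;
  end add16;
begin
  -- Adder stage: no clock, so it follows din and shifted immediately.
  total <= add16(din, shifted);

  -- Shift-register stage: the only clocked element of the accumulator.
  process
  begin
    wait until clk'event and clk = '1';
    if clr = '1' then
      shifted <= (others => '0');
    else
      -- Arithmetic right shift by 2 bits: replicate the sign bit.
      shifted <= total(15) & total(15) & total(15 downto 2);
    end if;
  end process;

  q <= total;
end acc_sketch;
```

Because only the shift register waits for the clock, the adder always sees the most recent shifted value, so no extra clock cycle of delay is introduced between the two stages.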


4. "Set" control in Test Bench 

Strangely, the "set" control in the test bench does not get the value '1' at the
beginning of the simulation. The function of "set" is to initialize all the subtractors'
carries in "adsu" to '1' in order to accomplish the subtraction. This initialization is
performed only once; afterwards the carry of each subtractor is propagated by the
subtractor itself. That is to say, the carry is a variable in "adsu". This carry
variable is initialized by "set" first and would be influenced by "set" at subsequent
times if the handling of the signal "set" were not modified. Fortunately, "set" has to
change only once, from '0' to '1', at the beginning of the simulation. Therefore, an
"event" test is applied to "set" so that it acts on the carry variable only when it
changes. Since "set" changes only once, it has no further influence on the carry
variable. In addition, the time at which "set" changes its state is very important.
The clock is '0' at the beginning of the simulation and changes its state to '1' after
5 ns. If "set" changes its state at any time other than 5 ns, the subtraction result
will be wrong. Only when "set" changes its state at 5 ns will the result of the
subtraction be correct.


5. Signals cannot be used as variables in VHDL 
In solving the problem mentioned in the previous section, an attempt was made
to use the "set" signal directly as a variable within the process. This yields a syntax
error during compilation of the source code.
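
The distinction between signals and variables can be shown with a short process. This is a minimal sketch whose entity name set_demo and port names are hypothetical; it only illustrates the language rule and is not the "adsu" model.

```vhdl
-- Sketch: entity, port and variable names are hypothetical.
entity set_demo is
  port (set, clk : in  bit;
        q        : out bit);
end set_demo;

architecture beh of set_demo is
begin
  process
    variable carry : bit := '0';  -- local state, visible only in this process
  begin
    wait until clk'event and clk = '1';
    -- A signal can be read and copied into a variable. set'event is true only
    -- if set changed in the same simulation cycle as this rising clock edge.
    if set'event then
      carry := set;   -- ":=" updates a variable immediately
    end if;
    q <= carry;       -- "<=" schedules a new value for a signal
    -- By contrast, "set := '1';" would not compile: set is a signal
    -- (an input port), so it cannot be the target of ":=".
  end process;
end beh;
```

The workaround shown here is to keep the state in a variable declared inside the process and to copy the value of the signal into it when needed.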


6. Preventing Negative Zero occurrences in Pack1

A block of source code was added to pack1 at the end of "in_to_bi" when
negative zeros were found during the simulation. When these negative zeros arrive at the
gate of shi_2, they generate very large negative numbers and cause an error at the
output. This unwanted situation was handled by adding source code that checks for
negative zeros at the end of the integer-to-binary conversion procedure. Although this
extra check works, it implies that an extra circuit must be added, which is not the goal
in circuit design. A close inspection of the in_to_bi source code revealed a very small
mistake. At the beginning of the conversion of the bit stream into 2's complement code,
the integer is tested for being positive or negative in order to assign the correct sign
bit "w(15)" of the converted binary number. It was found that "w(15) := '0'" is
assigned only when "m > '0'". All other values are assigned "w(15) := '1'". This is how
negative zeros are generated. Had the source code "m > '0'" been changed to
"m >= '0'", the extra negative zero check would not have been necessary.
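
The effect of the sign test can be seen in a simplified conversion function. The following is a hypothetical sketch and not the in_to_bi procedure of pack1: the package name conv_sketch and the function name int_to_bits are assumptions, and the input is assumed to lie between -32768 and 32767.

```vhdl
-- Hypothetical package: names are illustrative, not those of pack1.
package conv_sketch is
  function int_to_bits(m : integer) return bit_vector;
end conv_sketch;

package body conv_sketch is
  -- Converts m (assumed to lie in -32768 .. 32767) into a 16-bit
  -- two's-complement word.
  function int_to_bits(m : integer) return bit_vector is
    variable w   : bit_vector(15 downto 0) := (others => '0');
    variable mag : integer;
    variable c   : bit := '1';  -- carry for the "+1" step
    variable t   : bit;
  begin
    -- Binary magnitude first.
    if m >= 0 then
      mag := m;
    else
      mag := -m;
    end if;
    for i in 0 to 15 loop
      if mag mod 2 = 1 then
        w(i) := '1';
      end if;
      mag := mag / 2;
    end loop;
    -- Only strictly negative values are negated (invert, then add 1).
    -- Zero is treated like a positive number, so its sign bit stays '0'.
    if m < 0 then
      w := not w;
      for i in 0 to 15 loop
        t    := w(i);
        w(i) := t xor c;
        c    := t and c;
      end loop;
    end if;
    return w;
  end int_to_bits;
end conv_sketch;
```

With this formulation int_to_bits(0) returns sixteen '0' bits, and int_to_bits(-1) returns sixteen '1' bits, so no separate negative zero check is needed.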
VI. CONCLUSION

The main objectives of this thesis, using VHDL to describe a 1-D DCT
structural architecture of an 8 X 8 image block and simulating it on a workstation, have
been reached. The basic theory of the 1-D DCT, the principle of distributed arithmetic
and the actual hardware architecture have been made clearer by the VHDL simulation.
Above all, experience in using VHDL to describe an algorithm and in simulating the VHDL
description has been obtained. Although getting familiar with the language and its
simulation was time-consuming, the benefits of signal tracing and timing modeling have
been demonstrated in this thesis. VHDL itself is a portable document and a hierarchical
language. Therefore, the work in this thesis can be adopted in other, more complicated
designs.

Although the VHDL simulation result of the integer calculation is not as precise as
that of a floating point calculation, the resulting energy spectrum of the 1-D DCT is
good enough to recover the original image block. Besides, absolute accuracy is not
important for image compression; it is the relative value between pixel points that
matters. Another point worth mentioning is that the approach in this thesis has the
advantage of calculation speed, since the hardware for floating point calculation is
much more complicated than that for integer calculation.

One very important module, the transpose module, was not described. The transpose
module can be connected to the test bench to accomplish automatic 2-D DCT simulation.

The simulation done here is only the initial part of the "top-down design" process.
The algorithm of an 8 X 8 image block 2-D DCT in VHDL behavior description was
implemented. This behavior description can be further developed into gate level
descriptions. Once the gate level is reached, the hardware circuit implementation can
be realized.

APPENDIX A. 12-BIT 1-D DCT VHDL SOURCE CODES


-------------------- Normal clock generator ------------------------ 
entity clock_ge is
  port(CLCK : inout bit);
end clock_ge;
architecture clk_ctl of clock_ge is
begin
  process(CLCK)
    variable I : integer := 0;
  begin
    CLCK <= not CLCK after 5 ns;
    I := I + 1;
    assert I <= 80
      report "job done"
      severity Error;
  end process;
end clk_ctl;


entity LOAD is
  port (AI : in bit_vector(11 downto 0);
        B0,B1,B2,B3,B4,B5,B6,B7 : out bit_vector(11 downto 0);
        CLK : in bit);
end LOAD;
architecture BEH of LOAD is
  type shift is array(0 to 7) of bit_vector(11 downto 0);
begin
  process
    variable A : shift;
    variable I,count : integer := 0;
  begin
    wait until CLK'event and CLK = '1';
    for count in 0 to 7 loop
      wait until CLK'event and CLK = '1';
      for I in 0 to 6 loop
        A(I) := A(I+1);
      end loop;
      A(7) := AI;
      if (count = 7) and (CLK'event and CLK = '1') then
        B0 <= A(7);
        B1 <= A(6);
        B2 <= A(5);
        B3 <= A(4);
        B4 <= A(3);
        B5 <= A(2);
        B6 <= A(1);
        B7 <= A(0);
      end if;
    end loop;
    wait on AI,CLK;
  end process;
end BEH;


-------------------- Twice faster clock generator -------------------- 
entity clock is
  port(CLK : inout bit := '1');
end clock;
architecture beh of clock is
begin
  process(CLK)
    variable I : integer := 0;
  begin
    CLK <= not CLK after 2.5 ns;
    I := I + 1;
    assert I <= 160
      report "job done"
      severity Error;
  end process;
end beh;
-------------------- Delay gate --------------------
entity delay10 is
  port(a : bit; b : out bit; CLK : bit);
end delay10;
architecture beh of delay10 is
begin
  process
    variable x : bit;
  begin
    wait until CLK'event and CLK = '1';
    x := a;
    b <= x;
    wait on CLK,a;
  end process;
end beh;
-------------------- Parallel shift out 2-bit register -------------------
entity shift is
  port(bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7 : in bit_vector(11 downto 0);
       bo0,bo1,bo2,bo3,bo4,bo5,bo6,bo7 : out bit_vector(1 downto 0);
       CLK : in bit);
end shift;
architecture beh of shift is
begin
  process
    variable I : integer := 0;
  begin
    wait for 90 ns;
    for r in 0 to 5 loop
      wait until CLK'event and CLK = '1';
      bo0(0) <= bi0(I);
      bo0(1) <= bi0(I+1);
      bo1(0) <= bi1(I);
      bo1(1) <= bi1(I+1);
      bo2(0) <= bi2(I);
      bo2(1) <= bi2(I+1);
      bo3(0) <= bi3(I);
      bo3(1) <= bi3(I+1);
      bo4(0) <= bi4(I);
      bo4(1) <= bi4(I+1);
      bo5(0) <= bi5(I);
      bo5(1) <= bi5(I+1);
      bo6(0) <= bi6(I);
      bo6(1) <= bi6(I+1);
      bo7(0) <= bi7(I);
      bo7(1) <= bi7(I+1);
      I := I + 2;
    end loop;
    I := 0;
    wait on CLK,bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7;
  end process;
end beh;
-------------------- 2-bit adder/subtractor -------------------
entity adsu is
port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
CLK, cr,st : bit);
end adsu;
architecture beh of adsu is
begin
process
variable c1,c2,c3,c4,c5,c6,c7,c8 : bit_vector(4 downto 0);
variable d1,d2,d3,d4,d5,d6,d7,d8 : bit_vector(2 downto 0);
variable e1,e2,e3,e4,e5,e6,e7,e8 : bit;
begin
wait until CLK'event and CLK = '1';
if cr'event then
e1 := cr; e2 := cr; e3 := cr; e4 := cr;
end if;
if st'event then
e5 := st; e6 := st; e7 := st; e8 := st;
end if;
c1(0) := e1;
c1(1) := a0(0);
c1(2) := a0(1);
c1(3) := a7(0);
c1(4) := a7(1);
d1(0) := (c1(1) and (not c1(3)) and (not c1(0)))
or (not c1(1) and c1(3) and (not c1(0)))
or (not c1(1) and (not c1(3)) and c1(0))
or (c1(1) and c1(3) and c1(0));
d1(1) := (not c1(2) and not c1(1) and c1(4) and not c1(0))
or (not c1(2) and c1(4) and not c1(3) and not c1(0))
or (c1(2) and not c1(4) and not c1(3) and not c1(0))
or (c1(2) and not c1(1) and not c1(4) and not c1(0))
or (not c1(2) and c1(1) and not c1(4) and c1(3))
or (c1(2) and c1(1) and c1(3) and c1(4))
or (not c1(1) and not c1(2) and c1(4) and not c1(3))
or (not c1(1) and c1(2) and not c1(3) and not c1(4))
or (c1(1) and not c1(2) and not c1(4) and c1(0))
or (not c1(2) and not c1(4) and c1(3) and c1(0))
or (c1(2) and c1(3) and c1(4) and c1(0))
or (c1(2) and c1(1) and c1(4) and c1(0));
d1(2) := (c1(1) and c1(2) and c1(3))
or (c1(1) and c1(3) and c1(4))
or (c1(1) and c1(2) and c1(0))
or (c1(2) and c1(3) and c1(0))
or (c1(3) and c1(4) and c1(0))
or (c1(2) and c1(4))
or (c1(1) and c1(4) and c1(0));
b0(0) <= d1(0);
b0(1) <= d1(1);
e1 := d1(2);

c2(0) := e2;
c2(1) := a1(0);
c2(2) := a1(1);
c2(3) := a6(0);
c2(4) := a6(1);
d2(0) := (c2(1) and (not c2(3)) and (not c2(0)))


or (not c2(1) and c2(3) and (not c2(0))) 
or (not c2(1) and (not c2(3)) and c2(0)) 
or (c2(1) and c2(3) and c2(0)); 
d2(1) := (not c2(2) and not c2(1) and c2(4) and not c2(0)) 
or (not c2(2) and c2(4) and not c2(3) and not c2(0)) 
or (c2(2) and not c2(4) and not c2(3) and not c2(0)) 
or (c2(2) and not c2(1) and not c2(4) and not c2(0)) 
or (not c2(2) and c2(1) and not c2(4) and c2(3)) 
or (c2(2) and c2(1) and c2(3) and c2(4))
or (not c2(1) and not c2(2) and c2(4) and not c2(3)) 
or (not c2(1) and c2(2) and not c2(3) and not c2(4)) 
or (c2(1) and not c2(2) and not c2(4) and c2(0)) 
or (not c2(2) and not c2(4) and c2(3) and c2(0)) 
or (c2(2) and c2(3) and c2(4) and c2(0))
or (c2(2) and c2(1) and c2(4) and c2(0)); 
d2(2) := (c2(1) and c2(2) and c2(3)) 
or (c2(1) and c2(3) and c2(4)) 
or (c2(1) and c2(2) and c2(0)) 
or (c2(2) and c2(3) and c2(0)) 
or (c2(3) and c2(4) and c2(0)) 
or (c2(2) and c2(4)) 
or (c2(1) and c2(4) and c2(0)); 
b1(0) <= d2(0); 
b1(1) <= d2(1); 


e2 := d2(2); 
c3(0) := e3; 
c3(1) := a2(0); 
c3(2) := a2(1); 
c3(3) := a5(0); 


c3(4) := a5(1); 
d3(0) := (c3(1) and (not c3(3)) and (not c3(0))) 
or (not c3(1) and c3(3) and (not c3(0)))
or (not c3(1) and (not c3(3)) and c3(0))
or (c3(1) and c3(3) and c3(0));


d3(1) := (not c3(2) and not c3(1) and c3(4) and not c3(0)) 


or (not c3(2) and c3(4) and not c3(3) and not c3(0)) 
or (c3(2) and not c3(4) and not c3(3) and not c3(0)) 
or (c3(2) and not c3(1) and not c3(4) and not c3(0))
or (not c3(2) and c3(1) and not c3(4) and c3(3)) 

or (c3(2) and c3(1) and c3(3) and c3(4)) 

or (not c3(1) and not c3(2) and c3(4) and not c3(3)) 
or (not c3(1) and c3(2) and not c3(3) and not c3(4)) 
or (c3(1) and not c3(2) and not c3(4) and c3(0)) 

or (not c3(2) and not c3(4) and c3(3) and c3(0)) 

or (c3(2) and c3(3) and c3(4) and c3(0)) 

or (c3(2) and c3(1) and c3(4) and c3(0));


d3(2) := (c3(1) and c3(2) and c3(3)) 


or (c3(1) and c3(3) and c3(4)) 
or (c3(1) and c3(2) and c3(0)) 
or (c3(2) and c3(3) and c3(0)) 
or (c3(3) and c3(4) and c3(0)) 
or (c3(2) and c3(4)) 

or (c3(1) and c3(4) and c3(0)); 


b2(0) <= d3(0); 
b2(1) <= d3(1); 


e3 := d3(2); 
c4(0) := e4;
c4(1) := a3(0); 
c4(2) := a3(1); 
c4(3) := a4(0); 
c4(4) := a4(1); 
d4(0) := (c4(1) and (not c4(3)) and (not c4(0))) 


or (not c4(1) and c4(3) and (not c4(0))) 
or (not c4(1) and (not c4(3)) and c4(0)) 
or (c4(1) and c4(3) and c4(0)); 

d4(1) := (not c4(2) and not c4(1) and c4(4) and not c4(0)) 
or (not c4(2) and c4(4) and not c4(3) and not c4(0)) 
or (c4(2) and not c4(4) and not c4(3) and not c4(0)) 
or (c4(2) and not c4(1) and not c4(4) and not c4(0)) 
or (not c4(2) and c4(1) and not c4(4) and c4(3)) 
or (c4(2) and c4(1) and c4(3) and c4(4)) 
or (not c4(1) and not c4(2) and c4(4) and not c4(3)) 
or (not c4(1) and c4(2) and not c4(3) and not c4(4)) 
or (c4(1) and not c4(2) and not c4(4) and c4(0))
or (not c4(2) and not c4(4) and c4(3) and c4(0))
or (c4(2) and c4(3) and c4(4) and c4(0))
or (c4(2) and c4(1) and c4(4) and c4(0));
d4(2) := (c4(1) and c4(2) and c4(3)) 
or (c4(1) and c4(3) and c4(4)) 
or (c4(1) and c4(2) and c4(0)) 
or (c4(2) and c4(3) and c4(0)) 
or (c4(3) and c4(4) and c4(0)) 
or (c4(2) and c4(4)) 
or (c4(1) and c4(4) and c4(0)); 
b3(0) <= d4(0); 
b3(1) <= d4(1); 


e4 := d4(2); 

c5(0) := e5; 

c5(1) := a3(0); 

c5(2) := a3(1); 

c5(3) := not a4(0); 

c5(4) := not a4(1); 

d5(0) := (c5(1) and (not c5(3)) and (not c5(0))) 


or (not c5(1) and c5(3) and (not c5(0))) 
or (not c5(1) and (not c5(3)) and c5(0)) 
or (c5(1) and c5(3) and c5(0));
d5(1) := (not c5(2) and not c5(1) and c5(4) and not c5(0)) 
or (not c5(2) and c5(4) and not c5(3) and not c5(0)) 
or (c5(2) and not c5(4) and not c5(3) and not c5(0)) 
or (c5(2) and not c5(1) and not c5(4) and not c5(0)) 
or (not c5(2) and c5(1) and not c5(4) and c5(3)) 
or (c5(2) and c5(1) and c5(3) and c5(4)) 
or (not c5(1) and not c5(2) and c5(4) and not c5(3)) 
or (not c5(1) and c5(2) and not c5(3) and not c5(4)) 
or (c5(1) and not c5(2) and not c5(4) and c5(0)) 
or (not c5(2) and not c5(4) and c5(3) and c5(0)) 
or (c5(2) and c5(3) and c5(4) and c5(0)) 
or (c5(2) and c5(1) and c5(4) and c5(0)); 
d5(2) := (c5(1) and c5(2) and c5(3)) 
or (c5(1) and c5(3) and c5(4)) 
or (c5(1) and c5(2) and c5(0)) 
or (c5(2) and c5(3) and c5(0)) 
or (c5(3) and c5(4) and c5(0)) 
or (c5(2) and c5(4)) 
or (c5(1) and c5(4) and c5(0)); 
b4(0) <= d5(0);
b4(1) <= d5(1);
e5 := d5(2); 
c6(0) := e6; 
c6(1) := a2(0); 
c6(2) := a2(1); 
c6(3) := not a5(0); 
c6(4) := not a5(1); 
d6(0) := (c6(1) and (not c6(3)) and (not c6(0)))
or (not c6(1) and c6(3) and (not c6(0)))
or (not c6(1) and (not c6(3)) and c6(0))
or (c6(1) and c6(3) and c6(0)); 
d6(1) := (not c6(2) and not c6(1) and c6(4) and not c6(0)) 
or (not c6(2) and c6(4) and not c6(3) and not c6(0)) 
or (c6(2) and not c6(4) and not c6(3) and not c6(0)) 
or (c6(2) and not c6(1) and not c6(4) and not c6(0)) 
or (not c6(2) and c6(1) and not c6(4) and c6(3)) 
or (c6(2) and c6(1) and c6(3) and c6(4)) 
or (not c6(1) and not c6(2) and c6(4) and not c6(3)) 
or (not c6(1) and c6(2) and not c6(3) and not c6(4)) 
or (c6(1) and not c6(2) and not c6(4) and c6(0)) 
or (not c6(2) and not c6(4) and c6(3) and c6(0)) 
or (c6(2) and c6(3) and c6(4) and c6(0)) 
or (c6(2) and c6(1) and c6(4) and c6(0)); 
d6(2) := (c6(1) and c6(2) and c6(3)) 
or (c6(1) and c6(3) and c6(4)) 
or (c6(1) and c6(2) and c6(0)) 
or (c6(2) and c6(3) and c6(0)) 
or (c6(3) and c6(4) and c6(0)) 
or (c6(2) and c6(4)) 
or (c6(1) and c6(4) and c6(0)); 
b5(0) <= d6(0); 
b5(1) <= d6(1); 


e6 := d6(2); 

c7(0) := e7; 

c7(1) := a1(0);

c7(2) := a1(1);

c7(3) := not a6(0); 

c7(4) := not a6(1); 

d7(0) := (c7(1) and (not c7(3)) and (not c7(0))) 


or (not c7(1) and c7(3) and (not c7(0))) 
or (not c7(1) and (not c7(3)) and c7(0))
or (c7(1) and c7(3) and c7(0));

d7(1) := (not c7(2) and not c7(1) and c7(4) and not c7(0)) 
or (not c7(2) and c7(4) and not c7(3) and not c7(0)) 
or (c7(2) and not c7(4) and not c7(3) and not c7(0)) 
or (c7(2) and not c7(1) and not c7(4) and not c7(0)) 
or (not c7(2) and c7(1) and not c7(4) and c7(3)) 
or (c7(2) and c7(1) and c7(3) and c7(4)) 
or (not c7(1) and not c7(2) and c7(4) and not c7(3)) 
or (not c7(1) and c7(2) and not c7(3) and not c7(4)) 
or (c7(1) and not c7(2) and not c7(4) and c7(0)) 
or (not c7(2) and not c7(4) and c7(3) and c7(0))
or (c7(2) and c7(3) and c7(4) and c7(0)) 
or (c7(2) and c7(1) and c7(4) and c7(0)); 

d7(2) := (c7(1) and c7(2) and c7(3)) 
or (c7(1) and c7(3) and c7(4)) 
or (c7(1) and c7(2) and c7(0)) 
or (c7(2) and c7(3) and c7(0)) 
or (c7(3) and c7(4) and c7(0)) 
or (c7(2) and c7(4)) 
or (c7(1) and c7(4) and c7(0)); 

b6(0) <= d7(0); 

b6(1) <= d7(1); 


e7 := d7(2); 

c8(0) := e8; 

c8(1) := a0(0); 

c8(2) := a0(1);

c8(3) := not a7(0); 

c8(4) := not a7(1); 

d8(0) := (c8(1) and (not c8(3)) and (not c8(0))) 


or (not c8(1) and c8(3) and (not c8(0))) 
or (not c8(1) and (not c8(3)) and c8(0)) 
or (c8(1) and c8(3) and c8(0)); 

d8(1) := (not c8(2) and not c8(1) and c8(4) and not c8(0)) 
or (not c8(2) and c8(4) and not c8(3) and not c8(0)) 
or (c8(2) and not c8(4) and not c8(3) and not c8(0)) 
or (c8(2) and not c8(1) and not c8(4) and not c8(0)) 
or (not c8(2) and c8(1) and not c8(4) and c8(3)) 
or (c8(2) and c8(1) and c8(3) and c8(4)) 
or (not c8(1) and not c8(2) and c8(4) and not c8(3)) 
or (not c8(1) and c8(2) and not c8(3) and not c8(4)) 
or (c8(1) and not c8(2) and not c8(4) and c8(0)) 
or (not c8(2) and not c8(4) and c8(3) and c8(0))
or (c8(2) and c8(3) and c8(4) and c8(0))
or (c8(2) and c8(1) and c8(4) and c8(0)); 
d8(2) := (c8(1) and c8(2) and c8(3)) 
or (c8(1) and c8(3) and c8(4)) 
or (c8(1) and c8(2) and c8(0)) 
or (c8(2) and c8(3) and c8(0)) 
or (c8(3) and c8(4) and c8(0)) 
or (c8(2) and c8(4)) 
or (c8(1) and c8(4) and c8(0)); 
b7(0) <= d8(0); 
b7(1) <= d8(1); 
e8 := d8(2); 
wait on a0,al,a2,a3,a4,a5,a6,a7,CLK,cr,st; 
end process; 
end beh; 


entity reg is
  port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
       b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
       CLK : bit);
end reg;
architecture beh of reg is
begin
  process
    variable d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(1 downto 0);
  begin
    d0 := a0;
    d1 := a1;
    d2 := a2;
    d3 := a3;
    d4 := a4;
    d5 := a5;
    d6 := a6;
    d7 := a7;
    wait until CLK'event and CLK = '1';
    b0 <= d0;
    b1 <= d1;
    b2 <= d2;
    b3 <= d3;
    b4 <= d4;
    b5 <= d5;
    b6 <= d6;
    b7 <= d7;
    wait on CLK;
  end process;
end beh;

entity rom is
port(e0,e1,e2,e3,e4,e5,e6,e7 : bit_vector(1 downto 0);
b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81:
out bit_vector(15 downto 0);
CLK : bit);
end rom;
architecture beh of rom is
begin
process
variable a10,a11,a20,a21,a30,a31,a40,a41,a50,a51,a60,a61,a70,a71,
a80,a81 : bit_vector(3 downto 0);
begin
wait until CLK'event and CLK = '1';
a10(3) := e0(0); a10(2) := e1(0); a10(1) := e2(0); a10(0) := e3(0);
a11(3) := e0(1); a11(2) := e1(1); a11(1) := e2(1); a11(0) := e3(1);
a20(3) := e7(0); a20(2) := e6(0); a20(1) := e5(0); a20(0) := e4(0);
a21(3) := e7(1); a21(2) := e6(1); a21(1) := e5(1); a21(0) := e4(1);
a30(3) := e0(0); a30(2) := e1(0); a30(1) := e2(0); a30(0) := e3(0);
a31(3) := e0(1); a31(2) := e1(1); a31(1) := e2(1); a31(0) := e3(1);
a40(3) := e7(0); a40(2) := e6(0); a40(1) := e5(0); a40(0) := e4(0);
a41(3) := e7(1); a41(2) := e6(1); a41(1) := e5(1); a41(0) := e4(1);
a50(3) := e0(0); a50(2) := e1(0); a50(1) := e2(0); a50(0) := e3(0);
a51(3) := e0(1); a51(2) := e1(1); a51(1) := e2(1); a51(0) := e3(1);
a60(3) := e7(0); a60(2) := e6(0); a60(1) := e5(0); a60(0) := e4(0);
a61(3) := e7(1); a61(2) := e6(1); a61(1) := e5(1); a61(0) := e4(1);
a70(3) := e0(0); a70(2) := e1(0); a70(1) := e2(0); a70(0) := e3(0);
a71(3) := e0(1); a71(2) := e1(1); a71(1) := e2(1); a71(0) := e3(1);
a80(3) := e7(0); a80(2) := e6(0); a80(1) := e5(0); a80(0) := e4(0);
a81(3) := e7(1); a81(2) := e6(1); a81(1) := e5(1); a81(0) := e4(1);

case a10 is

when "0000" => b10 <= "0000000000000000";
when "0001" => b10 <= "0001011010100000";
when "0010" => b10 <= "0001011010100000";
when "0011" => b10 <= "0010110101000001";
when "0100" => b10 <= "0001011010100000";
when "0101" => b10 <= "0010110101000001";
when "0110" => b10 <= "0010110101000001";
when "0111" => b10 <= "0100001111100001";
when "1000" => b10 <= "0001011010100000";
when "1001" => b10 <= "0010110101000001";
when "1010" => b10 <= "0010110101000001";
when "1011" => b10 <= "0100001111100001";
when "1100" => b10 <= "0010110101000001";
when "1101" => b10 <= "0100001111100001";
when "1110" => b10 <= "0100001111100001";
when "1111" => b10 <= "0101101010000010";
end case;

case a11 is

when "0000" => b11 <= "0000000000000000";
when "0001" => b11 <= "0001011010100000";
when "0010" => b11 <= "0001011010100000";
when "0011" => b11 <= "0010110101000001";
when "0100" => b11 <= "0001011010100000";
when "0101" => b11 <= "0010110101000001";
when "0110" => b11 <= "0010110101000001";
when "0111" => b11 <= "0100001111100001";
when "1000" => b11 <= "0001011010100000";
when "1001" => b11 <= "0010110101000001";
when "1010" => b11 <= "0010110101000001";
when "1011" => b11 <= "0100001111100001";
when "1100" => b11 <= "0010110101000001";
when "1101" => b11 <= "0100001111100001";
when "1110" => b11 <= "0100001111100001";
when "1111" => b11 <= "0101101010000010";
end case;

case a20 is 

when "0000" => b20 <= "0000000000000000"; 
when "0001" => b20 <= "0000011000111110"; 
when "0010" => b20 <= "0001000111000111"; 
when "0011" => b20 <= "0001100000000101"; 
when "0100" => b20 <= "0001101010011011"; 
when "0101" => b20 <= "0010000011011001"; 
when "0110" => b20 <= "0010110001100010"; 
when "0111" => b20 <= "0011001010100000"; 
when "1000" => b20 <= "0001111101100010"; 
when "1001" => b20 <= "0010010110100000"; 
when "1010" => b20 <= "0011000100101001"; 
when "1011" => b20 <= "0011011101101000"; 
when "1100" => b20 <= "0011100111111101";
when "1101" => b20 <= "0100000000111100";
when "1110" => b20 <= "0100101111000101"; 
when "1111" => b20 <= "0101001000000011"; 
end case; 

case a21 is 

when "0000" => b21 <= "0000000000000000"; 
when "0001" => b21 <= "0000011000111110"; 
when "0010" => b21 <= "0001000111000111"; 
when "0011" => b21 <= "0001100000000101"; 
when "0100" => b21 <= "0001101010011011"; 
when "0101" => b21 <= "0010000011011001"; 
when "0110" => b21 <= "0010110001100010"; 
when "0111" => b21 <= "0011001010100000"; 
when "1000" => b21 <= "0001111101100010"; 
when "1001" => b21 <= "0010010110100000"; 
when "1010" => b21 <= "0011000100101001"; 
when "1011" => b21 <= "0011011101101000"; 
when "1100" => b21 <= "0011100111111101"; 
when "1101" => b21 <= "0100000000111100"; 
when "1110" => b21 <= "0100101111000101"; 
when "1111" => b21 <= "0101001000000011"; 


end case; 

case a30 is 

when "0000" => b30 <= "0000000000000000"; 
when "0001" => b30 <= "1110001001110000"; 
when "0010" => b30 <= "1111001111000010"; 
when "0011" => b30 <= "1101011000110001"; 
when "0100" => b30 <= "0000110000111110"; 
when "0101" => b30 <= "1110111010101111"; 
when "0110" => b30 <= "0000000000000000"; 
when "0111" => b30 <= "1110001001110000"; 
when "1000" => b30 <= "0001110110010000"; 
when "1001" => b30 <= "0000000000000000"; 
when "1010" => b30 <= "0001000101010001"; 
when "1011" => b30 <= "1111001111000010"; 
when "1100" => b30 <= "0010100111001111"; 
when "1101" => b30 <= "0000110000111110"; 
when "1110" => b30 <= "0001110110010000"; 
when "1111" => b30 <= "0000000000000000"; 
end case;

case a31 is


when "0000" => b31 <= "0000000000000000"; 
when "0001" => b31 <= "1110001001110000"; 
when "0010" => b31 <= "1111001111000010"; 
when "0011" => b31 <= "1101011000110001"; 
when "0100" => b31 <= "0000110000111110"; 
when "0101" => b31 <= "1110111010101111"; 
when "0110" => b31 <= "0000000000000000"; 
when "0111" => b31 <= "1110001001110000"; 
when "1000" => b31 <= "0001110110010000"; 
when "1001" => b31 <= "0000000000000000"; 
when "1010" => b31 <= "0001000101010001"; 
when "1011" => b31 <= "1111001111000010"; 
when "1100" => b31 <= "0010100111001111"; 
when "1101" => b31 <= "0000110000111110"; 
when "1110" => b31 <= "0001110110010000"; 
when "1111" => b31 <= "0000000000000000";
end case; 

case a40 is 

when "0000" => b40 <= "0000000000000000"; 
when "0001" => b40 <= "1110111000111001"; 
when "0010" => b40 <= "1110000010011110"; 
when "0011" => b40 <= "1100111011010111"; 
when "0100" => b40 <= "1111100111000010"; 
when "0101" => b40 <= "1110011111111011"; 
when "0110" => b40 <= "1101101001100000"; 
when "0111" => b40 <= "1100100010011000"; 
when "1000" => b40 <= "0001101010011011"; 
when "1001" => b40 <= "0000100011010100"; 
when "1010" => b40 <= "1111101100111001"; 
when "1011" => b40 <= "1110100101110010"; 
when "1100" => b40 <= "0001010001011101"; 
when "1101" => b40 <= "0000001010010101"; 
when "1110" => b40 <= "1111010011111011"; 
when "1111" => b40 <= "1110001100110100"; 
end case; 

case a41 is

when "0000" => b41 <= "0000000000000000"; 
when "0001" => b41 <= "1110111000111001"; 


when "0010" => b41 <= "1110000010011110";
when "0011" => b41 <= "1100111011010111";
when "0100" => b41 <= "1111100111000010"; 
when "0101" => b41 <= "1110011111111011"; 
when "0110" => b41 <= "1101101001100000"; 
when "0111" => b41 <= "1100100010011000"; 
when "1000" => b41 <= "0001101010011011"; 
when "1001" => b41 <= "0000100011010100"; 
when "1010" => b41 <= "1111101100111001"; 
when "1011" => b41 <= "1110100101110010"; 
when "1100" => b41 <= "0001010001011101"; 
when "1101" => b41 <= "0000001010010101"; 
when "1110" => b41 <= "1111010011111011"; 
when "1111" => b41 <= "1110001100110100"; 
end case; 

case a50 is 

when "0000" => b50 <= "0000000000000000"; 
when "0001" => b50 <= "0001011010100000"; 
when "0010" => b50 <= "1110100101100000"; 
when "0011" => b50 <= "0000000000000000"; 
when "0100" => b50 <= "1110100101100000"; 
when "0101" => b50 <= "0000000000000000"; 
when "0110" => b50 <= "1101001010111111"; 
when "0111" => b50 <= "1110100101100000"; 
when "1000" => b50 <= "0001011010100000"; 
when "1001" => b50 <= "0010110101000001"; 
when "1010" => b50 <= "0000000000000000"; 
When "1011" => b50 <= "0001011010100000"; 
When "1100" => b50 <= "0000000000000000"; 
When "1101" => b50 <= "0001011010100000"; 
When "1110" => b50 <= "1110100101100000"; 
When "1111" => b50 <= "0000000000000000"; 
end case; 

case a51 is 

when "0000" => b51 <= "0000000000000000"; 
when "0001" => b51 <= "0001011010100000"; 
when "0010" => b51 <= "1110100101100000"; 
when "0011" => b51 <= "0000000000000000"; 
when "0100" => b51 <= "1110100101100000"; 
when "0101" => b51 <= "0000000000000000"; 
when "0110" => b51 <= "1101001010111111"; 
when "0111" => b51 <= "1110100101100000";
when "1000" => b51 <= "0001011010100000";
when "1001" => b51 <= "0010110101000001"; 
when "1010" => b51 <= "0000000000000000"; 
When "1011" => b51 <= "0001011010100000"; 
When "1100" => b51 <= "0000000000000000"; 
When "1101" => b51 <= "0001011010100000"; 
When "1110" => b51 <= "1110100101100000"; 
When "1111" => b51 <= "0000000000000000"; 
end case; 

case a60 is

when "0000" => b60 <= "0000000000000000"; 
when "0001" => b60 <= "0001101010011011"; 
when "0010" => b60 <= "0000011000111110"; 
when "0011" => b60 <= "0010000011011001"; 
when "0100" => b60 <= "1110000010011110"; 
when "0101" => b60 <= "1111101100111001"; 
when "0110" => b60 <= "1110011011011100"; 
when "0111" => b60 <= "0000000101110110"; 
when "1000" => b60 <= "0001000111000111"; 
when "1001" => b60 <= "0010110001100010"; 
when "1010" => b60 <= "0001100000000101"; 
When "1011" => b60 <= "0011001010100000"; 
When "1100" => b60 <= "1111001001100101"; 
When "1101" => b60 <= "0000110100000000"; 
When "1110" => b60 <= "1111100010100011"; 
When "1111" => b60 <= "0001001100111110"; 
end case; 

case a61 is 

when "0000" => b61 <= "0000000000000000"; 
when "0001" => b61 <= "0001101010011011"; 
when "0010" => b61 <= "0000011000111110"; 
when "0011" => b61 <= "0010000011011001"; 
when "0100" => b61 <= "1110000010011110"; 
when "0101" => b61 <= "1111101100111001"; 
when "0110" => b61 <= "1110011011011100"; 
when "0111" => b61 <= "0000000101110110"; 
when "1000" => b61 <= "0001000111000111"; 
when "1001" => b61 <= "0010110001100010"; 
when "1010" => b61 <= "0001100000000101";
When "1011" => b61 <= "0011001010100000";
When "1100" => b61 <= "1111001001100101";
When "1101" => b61 <= "0000110100000000";
When "1110" => b61 <= "1111100010100011"; 
When "1111" => b61 <= "0001001100111110"; 
end case; 

case a70 is 

when "0000" => b70 <= "0000000000000000"; 
when "0001" => b70 <= "1111001111000010"; 
when "0010" => b70 <= "0001110110010000"; 
when "0011" => b70 <= "0001000101010001"; 
when "0100" => b70 <= "1110001001110000"; 
when "0101" => b70 <= "1101011000110001"; 
when "0110" => b70 <= "0000000000000000"; 
when "0111" => b70 <= "1111001111000010"; 
when "1000" => b70 <= "0000110000111110"; 


when "1001" => b70 <= "0000000000000000"; 
when "1010" => b70 <= "0010100111001111"; 
When "1011" => b70 <= "0001110110010000"; 
When "1100" => b70 <= "1110111010101111";


When "1101" => b70 <= "1110001001110000"; 
When "1110" => b70 <= "0000110000111110"; 
When "1111" => b70 <= "0000000000000000"; 
end case; 

case a71 is 

when "0000" => b71 <= "0000000000000000"; 
when "0001" => b71 <= "1111001111000010"; 
when "0010" => b71 <= "0001110110010000"; 
when "0011" => b71 <= "0001000101010001"; 
when "0100" => b71 <= "1110001001110000"; 
when "0101" => b71 <= "1101011000110001"; 
when "0110" => b71 <= "0000000000000000"; 
when "0111" => b71 <= "1111001111000010"; 
when "1000" => b71 <= "0000110000111110"; 
when "1001" => b71 <= "0000000000000000"; 
when "1010" => b71 <= "0010100111001111"; 
When "1011" => b71 <= "0001110110010000"; 
When "1100" => b71 <= "1110111010101111"; 
When "1101" => b71 <= "1110001001110000"; 
When "1110" => b71 <= "0000110000111110"; 
When "1111" => b71 <= "0000000000000000"; 
end case; 

case a80 is

when "0000" => b80 <= "0000000000000000"; 
when "0001" => b80 <= "1110000010011110"; 
when "0010" => b80 <= "0001101010011011"; 


when "0011" => b80 <= "1111101100111001"; 
when "0100" => b80 <= "1110111000111001"; 
when "0101" => b80 <= "1100111011010111"; 
when "0110" => b80 <= "0000100011010100"; 
when "0111" => b80 <= "1110100101110010"; 
when "1000" => b80 <= "0000011000111110"; 
when "1001" => b80 <= "1110011011011100"; 
when "1010" => b80 <= "0010000011011001"; 
When "1011" => b80 <= "0000000101110110"; 
When "1100" => b80 <= "1111010001110111"; 
When "1101" => b80 <= "1101010100010101"; 
When "1110" => b80 <= "0000111100010010"; 
When "1111" => b80 <= "1110111110110000"; 
end case; 

case a81 is 

when "0000" => b81 <= "0000000000000000"; 
when "0001" => b81 <= "1110000010011110"; 
when "0010" => b81 <= "0001101010011011"; 
when "0011" => b81 <= "1111101100111001"; 
when "0100" => b81 <= "1110111000111001"; 
when "0101" => b81 <= "1100111011010111"; 
when "0110" => b81 <= "0000100011010100"; 
when "0111" => b81 <= "1110100101110010"; 
when "1000" => b81 <= "0000011000111110"; 
when "1001" => b81 <= "1110011011011100"; 
when "1010" => b81 <= "0010000011011001"; 
When "1011" => b81 <= "0000000101110110"; 
When "1100" => b81 <= "1111010001110111"; 
When "1101" => b81 <= "1101010100010101"; 
When "1110" => b81 <= "0000111100010010"; 
When "1111" => b81 <= "1110111110110000"; 
end case; 


wait on e0,e1,e2,e3,e4,e5,e6,e7,CLK;
end process; 
end beh; 


entity shi_1 is
port(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16:
bit_vector(15 downto 0);
b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81:
out bit_vector(15 downto 0);
CLK : bit);
end shi_1;
architecture beh of shi_1 is 
begin 
process 
variable a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
begin
wait until CLK'event and CLK = '1';
if f1(15) = '0' then
a1(15) := '0';
else
a1(15) := '1';
end if;
a1(14) := f1(15);
a1(13) := f1(14);
a1(12) := f1(13);
a1(11) := f1(12);
a1(10) := f1(11);
a1(9) := f1(10);
a1(8) := f1(9);
a1(7) := f1(8);
a1(6) := f1(7);
a1(5) := f1(6);
a1(4) := f1(5);
a1(3) := f1(4);
a1(2) := f1(3);
a1(1) := f1(2);
a1(0) := f1(1);
b10 <= a1;
b11 <= f2;
if f3(15) = '0' then
a2(15) := '0';
else
a2(15) := '1';
end if;
a2(14) := f3(15);
a2(13) := f3(14);
a2(12) := f3(13);
a2(11) := f3(12);
a2(10) := f3(11);
a2(9) := f3(10);
a2(8) := f3(9);
a2(7) := f3(8);
a2(6) := f3(7);
a2(5) := f3(6);
a2(4) := f3(5);
a2(3) := f3(4);
a2(2) := f3(3);
a2(1) := f3(2);
a2(0) := f3(1);
b20 <= a2;
b21 <= f4;
if f5(15) = '0' then
a3(15) := '0';
else
a3(15) := '1';
end if;
a3(14) := f5(15);
a3(13) := f5(14);
a3(12) := f5(13);
a3(11) := f5(12);
a3(10) := f5(11);
a3(9) := f5(10);
a3(8) := f5(9);
a3(7) := f5(8);
a3(6) := f5(7);
a3(5) := f5(6);
a3(4) := f5(5);
a3(3) := f5(4);
a3(2) := f5(3);
a3(1) := f5(2);
a3(0) := f5(1);
b30 <= a3;
b31 <= f6;
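if f7(15) = '0' then
a4(15) := '0';
else
a4(15) := '1';
end if;
a4(14) := f7(15);
a4(13) := f7(14);
a4(12) := f7(13);
a4(11) := f7(12);
a4(10) := f7(11);
a4(9) := f7(10);
a4(8) := f7(9);
a4(7) := f7(8);
a4(6) := f7(7);
a4(5) := f7(6);
a4(4) := f7(5);
a4(3) := f7(4);
a4(2) := f7(3);
a4(1) := f7(2);
a4(0) := f7(1);
b40 <= a4;
b41 <= f8;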
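if f9(15) = '0' then
a5(15) := '0';
else
a5(15) := '1';
end if;
a5(14) := f9(15);
a5(13) := f9(14);
a5(12) := f9(13);
a5(11) := f9(12);
a5(10) := f9(11);
a5(9) := f9(10);
a5(8) := f9(9);
a5(7) := f9(8);
a5(6) := f9(7);
a5(5) := f9(6);
a5(4) := f9(5);
a5(3) := f9(4);
a5(2) := f9(3);
a5(1) := f9(2);
a5(0) := f9(1);
b50 <= a5;
b51 <= f10;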


if f11(15) = '0' then
a6(15) := '0';
else
a6(15) := '1';
end if;
a6(14) := f11(15);
a6(13) := f11(14);
a6(12) := f11(13);
a6(11) := f11(12);
a6(10) := f11(11);
a6(9) := f11(10);
a6(8) := f11(9);
a6(7) := f11(8);
a6(6) := f11(7);
a6(5) := f11(6);
a6(4) := f11(5);
a6(3) := f11(4);
a6(2) := f11(3);
a6(1) := f11(2);
a6(0) := f11(1);
b60 <= a6;
b61 <= f12;
if f13(15) = '0' then
a7(15) := '0';
else
a7(15) := '1';
end if;
a7(14) := f13(15);
a7(13) := f13(14);
a7(12) := f13(13);
a7(11) := f13(12);
a7(10) := f13(11);
a7(9) := f13(10);
a7(8) := f13(9);
a7(7) := f13(8);
a7(6) := f13(7);
a7(5) := f13(6);
a7(4) := f13(5);
a7(3) := f13(4);
a7(2) := f13(3);
a7(1) := f13(2);
a7(0) := f13(1);
b70 <= a7;
b71 <= f14;
if f15(15) = '0' then
a8(15) := '0';
else
a8(15) := '1';
end if;
a8(14) := f15(15);
a8(13) := f15(14);
a8(12) := f15(13);
a8(11) := f15(12);
a8(10) := f15(11);
a8(9) := f15(10);
a8(8) := f15(9);
a8(7) := f15(8);
a8(6) := f15(7);
a8(5) := f15(6);
a8(4) := f15(5);
a8(3) := f15(4);
a8(2) := f15(3);
a8(1) := f15(2);
a8(0) := f15(1);
b80 <= a8;
b81 <= f16;


wait on f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,CLK; 
end process; 
end beh; 
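-- Editorial sketch, not part of the original listing: in shi_1 each odd-numbered
-- input is shifted right by one bit with the sign bit repeated. Assuming only that
-- behaviour, the same result can be written as a single concatenation. The package
-- name shift_sketch and the function name asr1 are hypothetical, introduced here
-- purely for illustration.

```vhdl
package shift_sketch is
  -- One-bit arithmetic right shift of a 16-bit word (hypothetical helper).
  function asr1(v : bit_vector(15 downto 0)) return bit_vector;
end shift_sketch;

package body shift_sketch is
  function asr1(v : bit_vector(15 downto 0)) return bit_vector is
    variable r : bit_vector(15 downto 0);
  begin
    -- Bit 15 keeps the sign; bits 15..1 of v move down to positions 14..0.
    r := v(15) & v(15 downto 1);
    return r;
  end asr1;
end shift_sketch;
```

-- With such a helper visible, an assignment such as b10 <= asr1(f1); would stand in
-- for the sixteen bit-by-bit assignments to a1.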
-------------------- Package 1 --------------------
package pack1 is
procedure bi_to_in --change 16 bits(1 sign,1 integer and 14 fraction into real)
(variable x : bit_vector(15 downto 0);
variable y : out integer);
procedure in_to_bi --change real into binary(1 sign,1 integer,14 fractions).
(variable m : in integer;
variable n : out bit_vector(15 downto 0));
end pack1;
package body pack1 is
procedure bi_to_in
(variable x : bit_vector(15 downto 0);
variable y : out integer) is
variable sum : integer :=0;
variable p : bit_vector(15 downto 0);
begin
p := x;
if p(15) = '1' then
-- negative value: invert every bit above the lowest '1' to get the magnitude
for i in 0 to 14 loop
if p(i) = '1' then
for j in i to 13 loop
p(j+1) := not p(j+1);
end loop; exit;
end if;
end loop;
for k in 0 to 14 loop
if p(k) = '1' then
sum := sum + 2**k;
end if;
end loop;
y := -sum;
else
for l in 0 to 14 loop
if p(l) = '1' then
sum := sum + 2**l;
end if;
end loop;
y := sum;
end if;
end bi_to_in;
procedure in_to_bi
(variable m : in integer;
variable n : out bit_vector(15 downto 0)) is
variable temp_a : integer := 0;
variable temp_b : integer := 0;
variable w : bit_vector(15 downto 0);
begin
if m < 0 then
temp_a := -m;
else
temp_a := m;
end if;
for i in 14 downto 0 loop
temp_b := temp_a/(2**i);
temp_a := temp_a rem (2**i);
if (temp_b = 1) then
w(i) := '1';
end if;
end loop;
if m > 0 then
w(15) := '0';
else
w(15) := '1';
-- negative value: invert every bit above the lowest '1' (two's complement)
for k in 0 to 14 loop
if w(k) = '1' then
for j in k to 13 loop
w(j+1) := not w(j+1);
end loop; exit;
end if;
end loop;
end if;
-- prevent negative zero occurs.
if w(14)='0' and w(13)='0' and w(12)='0' and w(11)='0' and
w(10)='0' and
w(9)='0' and w(8)='0' and w(7)='0' and w(6)='0' and w(5)='0' and
w(4)='0' and w(3)='0' and w(2)='0' and w(1)='0' and w(0)='0' then
w(15) := '0';
end if;
n := w;
end in_to_bi;
end pack1;
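-- Editorial sketch, not part of the original listing: a small self-checking process
-- that exercises the two conversion procedures of pack1 on one positive and one
-- negative word. The entity name pack1_check and the chosen test words are
-- assumptions made for illustration; the expected integers follow from reading the
-- words as 16-bit two's-complement values.

```vhdl
use work.pack1.all;
entity pack1_check is end pack1_check;
architecture sketch of pack1_check is
begin
  process
    variable v, w : bit_vector(15 downto 0);
    variable n    : integer;
  begin
    -- Positive word: 4096 + 1024 + 512 + 128 + 32 = 5792.
    v := "0001011010100000";
    bi_to_in(v, n);
    assert n = 5792 report "bi_to_in: expected 5792" severity error;
    in_to_bi(n, w);
    assert w = v report "in_to_bi: positive round trip failed" severity error;

    -- Negative word: all ones is -1 in two's complement.
    v := "1111111111111111";
    bi_to_in(v, n);
    assert n = -1 report "bi_to_in: expected -1" severity error;
    in_to_bi(n, w);
    assert w = v report "in_to_bi: negative round trip failed" severity error;

    wait;  -- stop after a single pass
  end process;
end sketch;
```

-- Both actual parameters are variables because the procedures declare their
-- parameters with class variable.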
-------------------- 16-bit adder_g --------------------
use work.pack1.all;
entity add_g is
port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16:
bit_vector(15 downto 0);
b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
CLK,as : bit);
end add_g;
architecture beh of add_g is
begin 
process 
variable x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16, 
n1,n2,n3,n4,n5,n6,n7,n8 : bit_vector(15 downto 0); 
variable y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12,y13,y14,y15,y16, 
m1,m2,m3,m4,m5,m6,m7,m8 : integer := 0; 
begin 
wait until CLK'event and CLK = '1';
x1 := a1; x2 := a2; x3 := a3; x4 := a4;
x5 := a5; x6 := a6; x7 := a7; x8 := a8;
x9 := a9; x10 := a10; x11 := a11; x12 := a12;
x13 := a13; x14 := a14; x15 := a15; x16 := a16;
bi_to_in(x1,y1);bi_to_in(x2,y2);bi_to_in(x3,y3);bi_to_in(x4,y4);
bi_to_in(x5,y5);bi_to_in(x6,y6);bi_to_in(x7,y7);bi_to_in(x8,y8);
bi_to_in(x9,y9);bi_to_in(x10,y10);bi_to_in(x11,y11);
bi_to_in(x12,y12);
bi_to_in(x13,y13);bi_to_in(x14,y14);bi_to_in(x15,y15);
bi_to_in(x16,y16);
if as = '0' then
m1 := y1 + y2; m2 := y3 + y4; m3 := y5 + y6; m4 := y7 + y8;
m5 := y9 + y10; m6 := y11 + y12; m7 := y13 + y14; m8 := y15 + y16;
else
m1 := y1 - y2; m2 := y3 - y4; m3 := y5 - y6; m4 := y7 - y8;
m5 := y9 - y10; m6 := y11 - y12; m7 := y13 - y14; m8 := y15 - y16;
end if;
in_to_bi(m1,n1); in_to_bi(m2,n2); in_to_bi(m3,n3); in_to_bi(m4,n4);
in_to_bi(m5,n5); in_to_bi(m6,n6); in_to_bi(m7,n7); in_to_bi(m8,n8);
b1 <= n1; b2 <= n2; b3 <= n3; b4 <= n4;
b5 <= n5; b6 <= n6; b7 <= n7; b8 <= n8;
wait on a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,CLK;
end process; 
end beh; 


entity reg_h is
port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(15 downto 0);
b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(15 downto 0);
CLK : bit);
end reg_h;
architecture beh of reg_h is
begin
process
variable d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(15 downto 0);
begin
d0 := a0;
d1 := a1;
d2 := a2;
d3 := a3;
d4 := a4;
d5 := a5;
d6 := a6;
d7 := a7;
wait until CLK'event and CLK = '1';
b0 <= d0;
b1 <= d1;
b2 <= d2; 
b3 <= d3; 
b4 <= d4; 
b5 <= d5; 
b6 <= d6; 
b7 <= d7; 
wait on CLK; 
end process; 
end beh; 
-------------------- Adder i --------------------
use work.pack1.all;
entity add_i is
port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16:
bit_vector(15 downto 0);
b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
CLK : bit);
end add_i;
architecture beh of add_i is
begin
process
variable x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,
n1,n2,n3,n4,n5,n6,n7,n8 : bit_vector(15 downto 0);
variable y1,y2,y3,y4,y5,y6,y7,y8,y9,y10,y11,y12,y13,y14,y15,y16,
m1,m2,m3,m4,m5,m6,m7,m8 : integer := 0;
begin
x1 := a1; x2 := a2; x3 := a3; x4 := a4;
x5 := a5; x6 := a6; x7 := a7; x8 := a8;
x9 := a9; x10 := a10; x11 := a11; x12 := a12;
x13 := a13; x14 := a14; x15 := a15; x16 := a16;
bi_to_in(x1,y1);bi_to_in(x2,y2);bi_to_in(x3,y3);bi_to_in(x4,y4);
bi_to_in(x5,y5);bi_to_in(x6,y6);bi_to_in(x7,y7);bi_to_in(x8,y8);
bi_to_in(x9,y9);bi_to_in(x10,y10);bi_to_in(x11,y11);
bi_to_in(x12,y12);
bi_to_in(x13,y13);bi_to_in(x14,y14);bi_to_in(x15,y15);
bi_to_in(x16,y16);
m1 := y1 + y2; m2 := y3 + y4; m3 := y5 + y6; m4 := y7 + y8;
m5 := y9 + y10; m6 := y11 + y12; m7 := y13 + y14; m8 := y15 + y16;
in_to_bi(m1,n1); in_to_bi(m2,n2); in_to_bi(m3,n3); in_to_bi(m4,n4);
in_to_bi(m5,n5); in_to_bi(m6,n6); in_to_bi(m7,n7); in_to_bi(m8,n8);
b1 <= n1; b2 <= n2; b3 <= n3; b4 <= n4;
b5 <= n5; b6 <= n6; b7 <= n7; b8 <= n8;
wait on a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16;
end process; 
end beh; 


entity shi_2 is
port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
sr1,sr2,sr3,sr4,sr5,sr6,sr7,sr8,b1,b2,b3,b4,b5,b6,b7,b8 :
out bit_vector(15 downto 0);clr : bit_vector(15 downto 0);
CLK : bit);
end shi_2;
architecture beh of shi_2 is 
begin 
process 
variable x1,x2,x3,x4,x5,x6,x7,x8,y1,y2,y3,y4,y5,y6,y7,y8:
bit_vector(15 downto 0);
variable i : integer := 0;
begin
wait until CLK'event and CLK = '1';
x1 := a1; x2 := a2; x3 := a3; x4 := a4;
x5 := a5; x6 := a6; x7 := a7; x8 := a8;
if x1(15)='0' then
y1(13) := x1(15); y1(12) := x1(14); y1(11) := x1(13);
y1(10) := x1(12); y1(9) := x1(11); y1(8) := x1(10);
y1(7) := x1(9); y1(6) := x1(8); y1(5) := x1(7);
y1(4) := x1(6); y1(3) := x1(5); y1(2) := x1(4);
y1(1) := x1(3); y1(0) := x1(2); y1(14) := '0';
y1(15) := '0';
else
y1(13) := x1(15); y1(12) := x1(14); y1(11) := x1(13);
y1(10) := x1(12); y1(9) := x1(11); y1(8) := x1(10);
y1(7) := x1(9); y1(6) := x1(8); y1(5) := x1(7);
y1(4) := x1(6); y1(3) := x1(5); y1(2) := x1(4);
y1(1) := x1(3); y1(0) := x1(2); y1(14) := '1';
y1(15) := '1';
end if;


if x2(15)='0' then
y2(13) := x2(15); y2(12) := x2(14); y2(11) := x2(13);
y2(10) := x2(12); y2(9) := x2(11); y2(8) := x2(10);
y2(7) := x2(9); y2(6) := x2(8); y2(5) := x2(7);
y2(4) := x2(6); y2(3) := x2(5); y2(2) := x2(4);
y2(1) := x2(3); y2(0) := x2(2); y2(14) := '0';
y2(15) := '0';
else
y2(13) := x2(15); y2(12) := x2(14); y2(11) := x2(13);
y2(10) := x2(12); y2(9) := x2(11); y2(8) := x2(10);
y2(7) := x2(9); y2(6) := x2(8); y2(5) := x2(7);
y2(4) := x2(6); y2(3) := x2(5); y2(2) := x2(4);
y2(1) := x2(3); y2(0) := x2(2); y2(14) := '1';
y2(15) := '1';
end if;

if x3(15)='0' then
y3(13) := x3(15); y3(12) := x3(14); y3(11) := x3(13);
y3(10) := x3(12); y3(9) := x3(11); y3(8) := x3(10);
y3(7) := x3(9); y3(6) := x3(8); y3(5) := x3(7);
y3(4) := x3(6); y3(3) := x3(5); y3(2) := x3(4);
y3(1) := x3(3); y3(0) := x3(2); y3(14) := '0';
y3(15) := '0';
else
y3(13) := x3(15); y3(12) := x3(14); y3(11) := x3(13);
y3(10) := x3(12); y3(9) := x3(11); y3(8) := x3(10);
y3(7) := x3(9); y3(6) := x3(8); y3(5) := x3(7);
y3(4) := x3(6); y3(3) := x3(5); y3(2) := x3(4);
y3(1) := x3(3); y3(0) := x3(2); y3(14) := '1';
y3(15) := '1';
end if;

if x4(15)='0' then
y4(13) := x4(15); y4(12) := x4(14); y4(11) := x4(13);
y4(10) := x4(12); y4(9) := x4(11); y4(8) := x4(10);
y4(7) := x4(9); y4(6) := x4(8); y4(5) := x4(7);
y4(4) := x4(6); y4(3) := x4(5); y4(2) := x4(4);
y4(1) := x4(3); y4(0) := x4(2); y4(14) := '0';
y4(15) := '0';
else
y4(13) := x4(15); y4(12) := x4(14); y4(11) := x4(13);
y4(10) := x4(12); y4(9) := x4(11); y4(8) := x4(10);
y4(7) := x4(9); y4(6) := x4(8); y4(5) := x4(7);
y4(4) := x4(6); y4(3) := x4(5); y4(2) := x4(4);
y4(1) := x4(3); y4(0) := x4(2); y4(14) := '1';
y4(15) := '1';
end if;
if x5(15)='0' then
y5(13) := x5(15); y5(12) := x5(14); y5(11) := x5(13);
y5(10) := x5(12); y5(9) := x5(11); y5(8) := x5(10);
y5(7) := x5(9); y5(6) := x5(8); y5(5) := x5(7);
y5(4) := x5(6); y5(3) := x5(5); y5(2) := x5(4);
y5(1) := x5(3); y5(0) := x5(2); y5(14) := '0';
y5(15) := '0';
else
y5(13) := x5(15); y5(12) := x5(14); y5(11) := x5(13);
y5(10) := x5(12); y5(9) := x5(11); y5(8) := x5(10);
y5(7) := x5(9); y5(6) := x5(8); y5(5) := x5(7);
y5(4) := x5(6); y5(3) := x5(5); y5(2) := x5(4);
y5(1) := x5(3); y5(0) := x5(2); y5(14) := '1';
y5(15) := '1';
end if;
if x6(15)='0' then
y6(13) := x6(15); y6(12) := x6(14); y6(11) := x6(13);
y6(10) := x6(12); y6(9) := x6(11); y6(8) := x6(10);
y6(7) := x6(9); y6(6) := x6(8); y6(5) := x6(7);
y6(4) := x6(6); y6(3) := x6(5); y6(2) := x6(4);
y6(1) := x6(3); y6(0) := x6(2); y6(14) := '0';
y6(15) := '0';
else
y6(13) := x6(15); y6(12) := x6(14); y6(11) := x6(13);
y6(10) := x6(12); y6(9) := x6(11); y6(8) := x6(10);
y6(7) := x6(9); y6(6) := x6(8); y6(5) := x6(7);
y6(4) := x6(6); y6(3) := x6(5); y6(2) := x6(4);
y6(1) := x6(3); y6(0) := x6(2); y6(14) := '1';
y6(15) := '1';
end if;
if x7(15)='0' then
y7(13) := x7(15); y7(12) := x7(14); y7(11) := x7(13);
y7(10) := x7(12); y7(9) := x7(11); y7(8) := x7(10);
y7(7) := x7(9); y7(6) := x7(8); y7(5) := x7(7);
y7(4) := x7(6); y7(3) := x7(5); y7(2) := x7(4);
y7(1) := x7(3); y7(0) := x7(2); y7(14) := '0';
y7(15) := '0';
else
y7(13) := x7(15); y7(12) := x7(14); y7(11) := x7(13);
y7(10) := x7(12); y7(9) := x7(11); y7(8) := x7(10);
y7(7) := x7(9); y7(6) := x7(8); y7(5) := x7(7);
y7(4) := x7(6); y7(3) := x7(5); y7(2) := x7(4);
y7(1) := x7(3); y7(0) := x7(2); y7(14) := '1';
y7(15) := '1';
end if;
if x8(15)='0' then
y8(13) := x8(15); y8(12) := x8(14); y8(11) := x8(13);
y8(10) := x8(12); y8(9) := x8(11); y8(8) := x8(10);
y8(7) := x8(9); y8(6) := x8(8); y8(5) := x8(7);
y8(4) := x8(6); y8(3) := x8(5); y8(2) := x8(4);
y8(1) := x8(3); y8(0) := x8(2); y8(14) := '0';
y8(15) := '0';
else
y8(13) := x8(15); y8(12) := x8(14); y8(11) := x8(13);
y8(10) := x8(12); y8(9) := x8(11); y8(8) := x8(10);
y8(7) := x8(9); y8(6) := x8(8); y8(5) := x8(7);
y8(4) := x8(6); y8(3) := x8(5); y8(2) := x8(4);
y8(1) := x8(3); y8(0) := x8(2); y8(14) := '1';
y8(15) := '1';
end if;
sr1 <= y1; sr2 <= y2; sr3 <= y3; sr4 <= y4;
sr5 <= y5; sr6 <= y6; sr7 <= y7; sr8 <= y8;
i := i+1;
if i = 6 then
b1 <= y1; b2 <= y2; b3 <= y3; b4 <= y4;
b5 <= y5; b6 <= y6; b7 <= y7; b8 <= y8;
x1 := clr; x2 := clr; x3 := clr; x4 := clr;
x5 := clr; x6 := clr; x7 := clr; x8 := clr;
sr1 <= clr; sr2 <= clr; sr3 <= clr; sr4 <= clr;
sr5 <= clr; sr6 <= clr; sr7 <= clr; sr8 <= clr;
i := 0;
end if;
wait on a1,a2,a3,a4,a5,a6,a7,a8,clr,CLK;

end process; 

end beh; 
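-- Editorial sketch, not part of the original listing: each branch of shi_2 copies
-- x(15 downto 2) into y(13 downto 0) and fills y(15) and y(14) with the sign bit,
-- i.e. an arithmetic right shift by two places. Under that reading, the shift can be
-- expressed once for any shift count. The package name asr_sketch and the function
-- name asr are hypothetical.

```vhdl
package asr_sketch is
  -- Arithmetic right shift of a 16-bit word by n places (hypothetical helper).
  function asr(v : bit_vector(15 downto 0); n : natural) return bit_vector;
end asr_sketch;

package body asr_sketch is
  function asr(v : bit_vector(15 downto 0); n : natural) return bit_vector is
    variable r : bit_vector(15 downto 0);
  begin
    for k in 15 downto 0 loop
      if k + n > 15 then
        r(k) := v(15);      -- vacated upper positions take the sign bit
      else
        r(k) := v(k + n);   -- remaining bits move down by n places
      end if;
    end loop;
    return r;
  end asr;
end asr_sketch;
```

-- With n = 2 this gives r(15) = r(14) = r(13) = v(15) and r(12 downto 0) =
-- v(14 downto 2), matching the assignments written out for y1 through y8.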


entity result is 
port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
k : out bit_vector(15 downto 0);CLK : bit);
end result;
architecture beh of result is
type r is array(0 to 7) of bit_vector(15 downto 0);
begin
process
variable x : r;
begin
x(0) := a1; x(1) := a2; x(2) := a3; x(3) := a4;
x(4) := a5; x(5) := a6; x(6) := a7; x(7) := a8;
for i in 0 to 7 loop
wait until CLK'event and CLK = '1';
k <= x(i);
end loop;
wait on a1,a2,a3,a4,a5,a6,a7,a8,CLK;
end process; 
end beh; 
-------------------- Test bench --------------------
use work.pack1.all;
entity test is end test;
architecture str of test is
component clock_ge port(CLCK :inout bit);
end component;
component clock port(CLK :inout bit);
end component;
component control port(CLK : bit;ct : out bit);
end component;
component LOAD port(AI : in bit_vector(11 downto 0);
B0,B1,B2,B3,B4,B5,B6,B7 : out bit_vector(11 downto 0);
CLK : in bit);
end component;
component shift
port(bi0,bi1,bi2,bi3,bi4,bi5,bi6,bi7: in bit_vector(11 downto 0);
bo0,bo1,bo2,bo3,bo4,bo5,bo6,bo7: out bit_vector(1 downto 0);
CLK : in bit);
end component;
component adsu
port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
CLK,cr,st : bit);
end component;
component reg
port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0);
b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0);
CLK : bit);
end component;
component rom
port(e0,e1,e2,e3,e4,e5,e6,e7 : bit_vector(1 downto 0);
b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81:
out bit_vector(15 downto 0);
CLK : bit);
end component;
component shi_1
port(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16:
bit_vector(15 downto 0);
b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81:
out bit_vector(15 downto 0);
CLK : bit);
end component;
component delay1
port(a: bit;b: out bit;CLK: bit);
end component;
component delay2
port(a: bit;b: out bit;CLK: bit);
end component;
component delay3
port(a: bit;b: out bit;CLK: bit);
end component;
component delay4
port(a: bit;b: out bit;CLK: bit);
end component;
component delay5
port(a: bit;b: out bit;CLK: bit);
end component;
component delay6
port(a: bit;b: out bit;CLK: bit);
end component;
component delay7
port(a: bit;b: out bit;CLK: bit);
end component;
component delay8
port(a: bit;b: out bit;CLK: bit);
end component;
component delay9
port(a: bit;b: out bit;CLK: bit);
end component;
component delay10
port(a: bit;b: out bit;CLK: bit);
end component;
component add_g
port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16:
bit_vector(15 downto 0);
b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0);
CLK,as : bit);
end component;
component reg_h
port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(15 downto 0);
b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(15 downto 0);
CLK : bit);
end component;
component add_i
port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16:
bit_vector(15 downto 0);b1,b2,b3,b4,b5,b6,b7,b8 :
out bit_vector(15 downto 0);CLK : bit);
end component;
component shi_2
port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
sr1,sr2,sr3,sr4,sr5,sr6,sr7,sr8,b1,b2,b3,b4,b5,b6,b7,b8 :
out bit_vector(15 downto 0);clr : bit_vector(15 downto 0);
CLK : bit);
end component;
component result
port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
k : out bit_vector(15 downto 0); CLK : bit );
end component;
for C: clock_ge use entity work.clock_ge(clk_ctl);
for ad: clock use entity work.clock(beh);
for a : control use entity work.control(beh);
for L : LOAD use entity work.LOAD(BEH);
for S : shift use entity work.shift(beh);
for D : adsu use entity work.adsu(beh);
for r : reg use entity work.reg(beh);
for o : rom use entity work.rom(beh);
for s_1 : shi_1 use entity work.shi_1(beh);
for b : delay1 use entity work.delay1(beh);
for e : delay2 use entity work.delay2(beh);
for dely3 : delay3 use entity work.delay3(beh);
for dely4 : delay4 use entity work.delay4(beh);
for dely5 : delay5 use entity work.delay5(beh);
for dely6 : delay6 use entity work.delay6(beh);
for dely7 : delay7 use entity work.delay7(beh);
for dely8 : delay8 use entity work.delay8(beh);
for dely9 : delay9 use entity work.delay9(beh);
for dely10 : delay10 use entity work.delay10(beh);
for g : add_g use entity work.add_g(beh);
for h : reg_h use entity work.reg_h(beh);
for i : add_i use entity work.add_i(beh);
for j : shi_2 use entity work.shi_2(beh);
for t : result use entity work.result(beh);
signal di : bit_vector(11 downto 0); 
signal ck : bit; 
signal clck : bit; 
signal go : bit; 
signal io : bit; 
signal ho : bit; 
signal te : bit; 
signal de : bit; 
signal ab : bit;
signal cd : bit;
signal ef : bit;
signal gh : bit;
signal ij : bit;
signal kl : bit;
signal d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(11 downto 0);
signal so0,so1,so2,so3,so4,so5,so6,so7 : bit_vector(1 downto 0);
signal co0,co1,co2,co3,co4,co5,co6,co7 : bit_vector(1 downto 0);
signal do0,do1,do2,do3,do4,do5,do6,do7 : bit_vector(1 downto 0);
signal clr : bit :='0';
signal set : bit :='0';
signal e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16 :
bit_vector(15 downto 0);
signal f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16:
bit_vector(15 downto 0);
signal g1,g2,g3,g4,g5,g6,g7,g8 : bit_vector(15 downto 0);
signal h1,h2,h3,h4,h5,h6,h7,h8 : bit_vector(15 downto 0);
signal i1,i2,i3,i4,i5,i6,i7,i8 : bit_vector(15 downto 0);
signal j1,j2,j3,j4,j5,j6,j7,j8 : bit_vector(15 downto 0);
signal r1,r2,r3,r4,r5,r6,r7,r8 : bit_vector(15 downto 0);
signal cr : bit_vector(15 downto 0) := "0000000000000000"; 
signal p : bit_vector(15 downto 0); 
begin 
C : clock_ge port map(ck); 
ad : clock port map(clck);
a : control port map(ck,go);
b : delay1 port map(go,io,ck);
e : delay2 port map(ck,ho,clck);
dely3 : delay3 port map(ho,te,clck);
dely4 : delay4 port map(te,de,clck);
dely5 : delay5 port map(de,ab,clck);
dely6 : delay6 port map(ab,cd,clck);
dely7 : delay7 port map(cd,ef,clck);
dely8 : delay8 port map(ef,gh,clck);
dely9 : delay9 port map(gh,ij,clck);
dely10 : delay10 port map(ij,kl,clck);
L : LOAD port map(di,d0,d1,d2,d3,d4,d5,d6,d7,ck);
S : shift port map(d0,d1,d2,d3,d4,d5,d6,d7,
so0,so1,so2,so3,so4,so5,so6,so7,ck);
D : adsu port map(so0,so1,so2,so3,so4,so5,so6,so7,
co0,co1,co2,co3,co4,co5,co6,co7,
ck,clr,set);
r : reg port map(co0,co1,co2,co3,co4,co5,co6,co7,
do0,do1,do2,do3,do4,do5,do6,do7,
ck);
o : rom port map(do0,do1,do2,do3,do4,do5,do6,do7,
e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,
e15,e16,ck);
s_1 : shi_1 port map(e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16,
f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
ck);
g : add_g port map(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
g1,g2,g3,g4,g5,g6,g7,g8,ck,io);
h : reg_h port map(g1,g2,g3,g4,g5,g6,g7,g8,h1,h2,h3,h4,h5,h6,h7,h8,ck);
i : add_i port map(h1,r1,h2,r2,h3,r3,h4,r4,h5,r5,h6,r6,h7,r7,h8,r8,
i1,i2,i3,i4,i5,i6,i7,i8,ck);
j : shi_2 port map(i1,i2,i3,i4,i5,i6,i7,i8,r1,r2,r3,r4,r5,r6,r7,r8,
j1,j2,j3,j4,j5,j6,j7,j8,cr,kl);
t : result port map(j1,j2,j3,j4,j5,j6,j7,j8,p,ck);
set <= '1' after 5 ns;
di <= "000101101010" after 7 ns, 
"000000000000" after 17 ns, 
"000101101010" after 27 ns, 
"001011010100" after 37 ns, 
"000101101010" after 47 ns, 
"000000000000" after 57 ns, 
"000101101010" after 67 ns, 
"001011010100" after 77 ns; 
end str; 
APPENDIX B. 16-BIT 1-D DCT VHDL SOURCE CODE


entity shi_2 is
port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
sr1,sr2,sr3,sr4,sr5,sr6,sr7,sr8,b1,b2,b3,b4,b5,b6,b7,b8 :
out bit_vector(15 downto 0);clr : bit_vector(15 downto 0);
CLK : bit);
end shi_2;
architecture beh of shi_2 is 
begin 
process 
variable x1,x2,x3,x4,x5,x6,x7,x8,y1,y2,y3,y4,y5,y6,y7,y8:
bit_vector(15 downto 0);
variable i : integer := 0;
begin
wait until CLK'event and CLK = '1';
x1 := a1; x2 := a2; x3 := a3; x4 := a4;
x5 := a5; x6 := a6; x7 := a7; x8 := a8;
if x1(15)='0' then
y1(13) := x1(15); y1(12) := x1(14); y1(11) := x1(13);
y1(10) := x1(12); y1(9) := x1(11); y1(8) := x1(10);
y1(7) := x1(9); y1(6) := x1(8); y1(5) := x1(7);
y1(4) := x1(6); y1(3) := x1(5); y1(2) := x1(4);
y1(1) := x1(3); y1(0) := x1(2); y1(14) := '0';
y1(15) := '0';
else
y1(13) := x1(15); y1(12) := x1(14); y1(11) := x1(13);
y1(10) := x1(12); y1(9) := x1(11); y1(8) := x1(10);
y1(7) := x1(9); y1(6) := x1(8); y1(5) := x1(7);
y1(4) := x1(6); y1(3) := x1(5); y1(2) := x1(4);
y1(1) := x1(3); y1(0) := x1(2); y1(14) := '1';
y1(15) := '1';
end if;


if x2(15)='0' then
y2(13) := x2(15); y2(12) := x2(14); y2(11) := x2(13);
y2(10) := x2(12); y2(9) := x2(11); y2(8) := x2(10);
y2(7) := x2(9); y2(6) := x2(8); y2(5) := x2(7);
y2(4) := x2(6); y2(3) := x2(5); y2(2) := x2(4);
y2(1) := x2(3); y2(0) := x2(2); y2(14) := '0';
y2(15) := '0';
else
y2(13) := x2(15); y2(12) := x2(14); y2(11) := x2(13);
y2(10) := x2(12); y2(9) := x2(11); y2(8) := x2(10);
y2(7) := x2(9); y2(6) := x2(8); y2(5) := x2(7);
y2(4) := x2(6); y2(3) := x2(5); y2(2) := x2(4);
y2(1) := x2(3); y2(0) := x2(2); y2(14) := '1';
y2(15) := '1';
end if;
if x3(15)='0' then
y3(13) := x3(15); y3(12) := x3(14); y3(11) := x3(13);
y3(10) := x3(12); y3(9) := x3(11); y3(8) := x3(10);
y3(7) := x3(9); y3(6) := x3(8); y3(5) := x3(7);
y3(4) := x3(6); y3(3) := x3(5); y3(2) := x3(4);
y3(1) := x3(3); y3(0) := x3(2); y3(14) := '0';
y3(15) := '0';
else
y3(13) := x3(15); y3(12) := x3(14); y3(11) := x3(13);
y3(10) := x3(12); y3(9) := x3(11); y3(8) := x3(10);
y3(7) := x3(9); y3(6) := x3(8); y3(5) := x3(7);
y3(4) := x3(6); y3(3) := x3(5); y3(2) := x3(4);
y3(1) := x3(3); y3(0) := x3(2); y3(14) := '1';
y3(15) := '1';
end if;
if x4(15)='0' then
y4(13) := x4(15); y4(12) := x4(14); y4(11) := x4(13);
y4(10) := x4(12); y4(9) := x4(11); y4(8) := x4(10);
y4(7) := x4(9); y4(6) := x4(8); y4(5) := x4(7);
y4(4) := x4(6); y4(3) := x4(5); y4(2) := x4(4);
y4(1) := x4(3); y4(0) := x4(2); y4(14) := '0';
y4(15) := '0';
else
y4(13) := x4(15); y4(12) := x4(14); y4(11) := x4(13);
y4(10) := x4(12); y4(9) := x4(11); y4(8) := x4(10);
y4(7) := x4(9); y4(6) := x4(8); y4(5) := x4(7);
y4(4) := x4(6); y4(3) := x4(5); y4(2) := x4(4);
y4(1) := x4(3); y4(0) := x4(2); y4(14) := '1';
y4(15) := '1';
end if;
if x5(15)='0' then
y5(13) := x5(15); y5(12) := x5(14); y5(11) := x5(13);
y5(10) := x5(12); y5(9) := x5(11); y5(8) := x5(10);
y5(7) := x5(9); y5(6) := x5(8); y5(5) := x5(7);
y5(4) := x5(6); y5(3) := x5(5); y5(2) := x5(4);
y5(1) := x5(3); y5(0) := x5(2); y5(14) := '0';
y5(15) := '0';
else
y5(13) := x5(15); y5(12) := x5(14); y5(11) := x5(13);
y5(10) := x5(12); y5(9) := x5(11); y5(8) := x5(10);
y5(7) := x5(9); y5(6) := x5(8); y5(5) := x5(7);
y5(4) := x5(6); y5(3) := x5(5); y5(2) := x5(4);
y5(1) := x5(3); y5(0) := x5(2); y5(14) := '1';
y5(15) := '1';
end if;
if x6(15)='0' then
y6(13) := x6(15); y6(12) := x6(14); y6(11) := x6(13);
y6(10) := x6(12); y6(9) := x6(11); y6(8) := x6(10);
y6(7) := x6(9); y6(6) := x6(8); y6(5) := x6(7);
y6(4) := x6(6); y6(3) := x6(5); y6(2) := x6(4);
y6(1) := x6(3); y6(0) := x6(2); y6(14) := '0';
y6(15) := '0';
else
y6(13) := x6(15); y6(12) := x6(14); y6(11) := x6(13);
y6(10) := x6(12); y6(9) := x6(11); y6(8) := x6(10);
y6(7) := x6(9); y6(6) := x6(8); y6(5) := x6(7);
y6(4) := x6(6); y6(3) := x6(5); y6(2) := x6(4);
y6(1) := x6(3); y6(0) := x6(2); y6(14) := '1';
y6(15) := '1';
end if;
if x7(15)='0' then
y7(13) := x7(15); y7(12) := x7(14); y7(11) := x7(13);
y7(10) := x7(12); y7(9) := x7(11); y7(8) := x7(10);
y7(7) := x7(9); y7(6) := x7(8); y7(5) := x7(7);
y7(4) := x7(6); y7(3) := x7(5); y7(2) := x7(4);
y7(1) := x7(3); y7(0) := x7(2); y7(14) := '0';
y7(15) := '0';
else
y7(13) := x7(15); y7(12) := x7(14); y7(11) := x7(13);
y7(10) := x7(12); y7(9) := x7(11); y7(8) := x7(10);
y7(7) := x7(9); y7(6) := x7(8); y7(5) := x7(7);
y7(4) := x7(6); y7(3) := x7(5); y7(2) := x7(4);
y7(1) := x7(3); y7(0) := x7(2); y7(14) := '1';
y7(15) := '1';
end if;
if x8(15)='0' then
y8(13) := x8(15); y8(12) := x8(14); y8(11) := x8(13);
y8(10) := x8(12); y8(9) := x8(11); y8(8) := x8(10);
y8(7) := x8(9); y8(6) := x8(8); y8(5) := x8(7);
y8(4) := x8(6); y8(3) := x8(5); y8(2) := x8(4);
y8(1) := x8(3); y8(0) := x8(2); y8(14) := '0';
y8(15) := '0';
else
y8(13) := x8(15); y8(12) := x8(14); y8(11) := x8(13);
y8(10) := x8(12); y8(9) := x8(11); y8(8) := x8(10);
y8(7) := x8(9); y8(6) := x8(8); y8(5) := x8(7);
y8(4) := x8(6); y8(3) := x8(5); y8(2) := x8(4);
y8(1) := x8(3); y8(0) := x8(2); y8(14) := '1';
y8(15) := '1';
end if;
sr1 <= y1; sr2 <= y2; sr3 <= y3; sr4 <= y4;
sr5 <= y5; sr6 <= y6; sr7 <= y7; sr8 <= y8;
i := i+1;
if i = 8 then
b1 <= y1; b2 <= y2; b3 <= y3; b4 <= y4;
b5 <= y5; b6 <= y6; b7 <= y7; b8 <= y8;
x1 := clr; x2 := clr; x3 := clr; x4 := clr;
x5 := clr; x6 := clr; x7 := clr; x8 := clr;
sr1 <= clr; sr2 <= clr; sr3 <= clr; sr4 <= clr;
sr5 <= clr; sr6 <= clr; sr7 <= clr; sr8 <= clr;
i := 0;
end if;
wait on a1,a2,a3,a4,a5,a6,a7,a8,clr,CLK;
end process; 
end beh; 
eoecnn anna nnn -----=-- Test bench ------------------------ 
use work. pack1.all; 
entity test is end test; 
architecture str of test is 
component clock_ge port(CLCK :inout bit); 
end component; 
component clock port(CLK :inout bit); 
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end component; 


component control port(CLK : bit;ct : out bit); 
end component; 
component LOAD port(AI : in bit_vector(15 downto 0); 
B0,B1,B2,B3,B4,B5,B6,B7 : out bit_vector(15 downto 0); 
CLK :: in bit); 
end component; 
component shift 
port(bi0, bil ,bi2,bi3,bi4,bi5,bi6,bi7: in bit_vector(15 downto 0); 
bo0,bol,bo2,bo03,b04,bo5,b06,bo7 : out bit_vector(1 downto 0); 
CLK : in bit); 
end component; 
component adsu 
port(a0,al,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0); 
b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0); 
CLK, cr,st : bit); 
end component; 
component reg 
port(a0,al,a2,a3,a4,a5,a6,a7 : bit_vector(1 downto 0); 
b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(1 downto 0); 
CLK : bit); 
end component; 
component rom 
port(e0,e1,e2,e3,e4,e5,e6,e7 : bit_vector(1 downto 0); 
b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81: 
out bit_vector(15 downto 0); 
CLK : bit); 
end component; 
component shi_1
port(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16:
bit_vector(15 downto 0); 
b10,b11,b20,b21,b30,b31,b40,b41,b50,b51,b60,b61,b70,b71,b80,b81: 
out bit_vector(15 downto 0); 
CLK : bit); 
end component; 
component delay1
port(a: bit;b: out bit;CLK: bit);
end component;
component delay2
port(a: bit;b: out bit;CLK: bit);
end component;
component delay3
port(a: bit;b: out bit;CLK: bit);
end component;
component delay4 
port(a: bit;b: out bit;CLK: bit); 
end component; 
component delay15 
port(a: bit;b: out bit;CLK: bit); 
end component; 
component delay16 
port(a: bit;b: out bit;CLK: bit); 
end component; 
component delay17 
port(a: bit;b: out bit;CLK: bit); 
end component; 
component delay18 
port(a: bit;b: out bit;CLK: bit); 
end component; 
component add_g
port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16:
bit_vector(15 downto 0);
b1,b2,b3,b4,b5,b6,b7,b8 : out bit_vector(15 downto 0); 
CLK,as : bit); 
end component; 
component reg_h
port(a0,a1,a2,a3,a4,a5,a6,a7 : bit_vector(15 downto 0);
b0,b1,b2,b3,b4,b5,b6,b7 : out bit_vector(15 downto 0); 
CLK : bit); 
end component; 
component add_i
port(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16:
bit_vector(15 downto 0);b1,b2,b3,b4,b5,b6,b7,b8 : 
out bit_vector(15 downto 0);CLK : bit); 
end component; 
component shi_2 
port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
sr1,sr2,sr3,sr4,sr5,sr6,sr7,sr8,b1,b2,b3,b4,b5,b6,b7,b8 :
out bit_vector(15 downto 0);clr : bit_vector(15 downto 0);
CLK : bit); 
end component; 
component result 
port(a1,a2,a3,a4,a5,a6,a7,a8 : bit_vector(15 downto 0);
k : out bit_vector(15 downto 0); CLK : bit );
end component;

for C : clock_ge use entity work.clock_ge(clk_ctl);
for ad : clock use entity work.clock(beh);
for a : control use entity work.control(beh);
for L : LOAD use entity work.LOAD(BEH);
for S : shift use entity work.shift(beh);
for D : adsu use entity work.adsu(beh);
for r : reg use entity work.reg(beh);
for o : rom use entity work.rom(beh);
for s_1 : shi_1 use entity work.shi_1(beh);
for b : delay1 use entity work.delay1(beh);
for e : delay2 use entity work.delay2(beh);
for dely3 : delay3 use entity work.delay3(beh);
for dely4 : delay4 use entity work.delay4(beh);
for dely15 : delay15 use entity work.delay15(beh);
for dely16 : delay16 use entity work.delay16(beh);
for dely17 : delay17 use entity work.delay17(beh);
for dely18 : delay18 use entity work.delay18(beh);
for g : add_g use entity work.add_g(beh);
for h : reg_h use entity work.reg_h(beh);
for i : add_i use entity work.add_i(beh);
for j : shi_2 use entity work.shi_2(beh);
for t : result use entity work.result(beh);

signal di : bit_vector(15 downto 0);
signal ck : bit;
signal clck : bit;
signal go : bit;
signal io : bit;
signal ho : bit;
signal te : bit;
signal de : bit;
signal op,qr,st,eo,ko,mo,qo,ro,so,uo : bit;
signal d0,d1,d2,d3,d4,d5,d6,d7 : bit_vector(15 downto 0);
signal so0,so1,so2,so3,so4,so5,so6,so7 : bit_vector(1 downto 0);
signal co0,co1,co2,co3,co4,co5,co6,co7 : bit_vector(1 downto 0);
signal do0,do1,do2,do3,do4,do5,do6,do7 : bit_vector(1 downto 0);
signal clr : bit := '0';
signal set : bit := '0';
signal e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16 :
bit_vector(15 downto 0);
signal f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16 :
bit_vector(15 downto 0);
signal g1,g2,g3,g4,g5,g6,g7,g8 : bit_vector(15 downto 0);
signal h1,h2,h3,h4,h5,h6,h7,h8 : bit_vector(15 downto 0);
signal i1,i2,i3,i4,i5,i6,i7,i8 : bit_vector(15 downto 0);
signal j1,j2,j3,j4,j5,j6,j7,j8 : bit_vector(15 downto 0);
signal r1,r2,r3,r4,r5,r6,r7,r8 : bit_vector(15 downto 0);
signal cr : bit_vector(15 downto 0) := "0000000000000000";
signal p : bit_vector(15 downto 0);

begin
C : clock_ge port map(ck);
ad : clock port map(clck);
a : control port map(ck, go);
b : delay1 port map(go,io,ck);
e : delay2 port map(ck,ho,clck);
dely3 : delay3 port map(ho,te,clck);
dely4 : delay4 port map(te,de,clck);
dely15 : delay15 port map(io,eo,ck);
dely16 : delay16 port map(eo,ko,ck);
dely17 : delay17 port map(ko,mo,ck);
dely18 : delay18 port map(mo,qo,ck);
L : LOAD port map(di,d0,d1,d2,d3,d4,d5,d6,d7,ck);
S : shift port map(d0,d1,d2,d3,d4,d5,d6,d7,
so0,so1,so2,so3,so4,so5,so6,so7,de);
D : adsu port map(so0,so1,so2,so3,so4,so5,so6,so7,
co0,co1,co2,co3,co4,co5,co6,co7,
ck, clr, set);
r : reg port map(co0,co1,co2,co3,co4,co5,co6,co7,
do0,do1,do2,do3,do4,do5,do6,do7,
ck);
o : rom port map(do0,do1,do2,do3,do4,do5,do6,do7,
e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,
e15,e16,ck);
s_1 : shi_1 port map(e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14,e15,e16,
f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
ck);
g : add_g port map(f1,f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12,f13,f14,f15,f16,
g1,g2,g3,g4,g5,g6,g7,g8,ck,qo);
h : reg_h port map(g1,g2,g3,g4,g5,g6,g7,g8,h1,h2,h3,h4,h5,h6,h7,h8,ck);
i : add_i port map(h1,r1,h2,r2,h3,r3,h4,r4,h5,r5,h6,r6,h7,r7,h8,r8,
i1,i2,i3,i4,i5,i6,i7,i8,ck);
j : shi_2 port map(i1,i2,i3,i4,i5,i6,i7,i8,r1,r2,r3,r4,r5,r6,r7,r8,
j1,j2,j3,j4,j5,j6,j7,j8,cr,ho);
t : result port map(j1,j2,j3,j4,j5,j6,j7,j8,p,ck);

set <= '1' after 5 ns;

di <= "0000010110101000" after 7 ns,
"0000000000000000" after 17 ns,
"0000010110101000" after 27 ns,
"0000101101010000" after 37 ns,
"0000010110101000" after 47 ns,
"0000000000000000" after 57 ns,
"0000010110101000" after 67 ns,
"0000101101010000" after 77 ns;

end str;
APPENDIX C. MATLAB PROGRAM OF DECIMAL-BINARY CONVERSION


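% Convert a decimal number to a 16-bit binary word:
% x(1) is the sign bit, x(2) the integer bit, x(3:16) the fraction bits.
while 1
    x(1,16) = 0;
    y = input('Please enter your number : ');
    if y == 0    % exit value assumed; the operand is illegible in the source
        break
    end
    disp(' wait!');
    if y > 0,
        x(1) = 0;
    else
        x(1) = 1;
        y = abs(y);
    end
    i = 2;
    for k = 1:15;
        if y >= 1,    % ">=" so that a remainder of exactly 1 gives the digit 1
            x(i) = fix(y);
            y = y - x(i);
        else
            x(i) = 0;
        end
        y = 2*y;
        i = i+1;
    end
    disp(x);
end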
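
The conversion loop of this appendix can be cross-checked with a short Python sketch. It is an illustrative restatement under stated assumptions, not part of the original program: the function name `to_fixed_bits` and the `width` parameter are hypothetical, and the word layout (sign bit, one integer bit, then fraction bits, in sign-magnitude form) is read off the MATLAB loop. The sketch compares against 1 with `>=` so that a remainder of exactly 1 produces the digit 1.

```python
def to_fixed_bits(value, width=16):
    """Return `value` as a bit string: sign, one integer bit, fraction bits.

    Hypothetical helper mirroring the MATLAB loop; magnitudes must be < 2.
    """
    if abs(value) >= 2:
        raise ValueError("magnitude must be below 2 for this word format")
    bits = [0 if value >= 0 else 1]      # sign bit (sign-magnitude)
    y = abs(value)
    for _ in range(width - 1):           # integer bit, then fraction bits
        if y >= 1:
            digit = int(y)               # always 1 here because y < 2
            y -= digit
        else:
            digit = 0
        bits.append(digit)
        y *= 2                           # expose the next lower bit
    return "".join(str(b) for b in bits)


print(to_fixed_bits(0.5))      # 0010000000000000
print(to_fixed_bits(0.7071))   # 0010110101000001
```

The second value is the operand 0010110101000001 that appears in the hand calculations of Appendix D.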
APPENDIX D. STRUCTURAL 1-D DCT HAND CALCULATION


(1)   0000000000000000
    + 0000000000000000
    = 0000000000000000

(2)   0000000000000000
    + 0101101010000010
    = 0101101010000010
    + 0000000000000000
    = 0101101010000010

(3)   0010110101000001
    + 0101101010000010
    = 0111000100100010
    + 0001011010100000
    = 1000011111000010   (overflow)

(4)   0010110101000001
    + 0101101010000010
    = 0111000100100010
    + 0010000111110000
    = 1001001100010010

(5)   0010110101000001
    + 0000000000000000
    = 0001011010100000
    + 0010010011000100
    = 0011101101100100

(6)   0010110101000001
    + 0000000000000000
    = 0001011010100000
    + 0000111011011001
    = 0010010101111001

(7)   0010110101000001
    + 0000000000000000
    = 0001011010100000
    + 0000100101011110
    = 0001111111111110

(8)   0000000000000000
    - 0000000000000000
    = 0000000000000000
    + 0000011111111111
    = 0000011111111111


Fig. 19 U0 hand calculation 


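(1)   0000000000000000
    + 0000000000000000
    = 0000000000000000

(2)   0000000000000000
    + 0101001000000011
    = 0101001000000011
    + 0000000000000000
    = 0101001000000011

(3)   0001100000000101
    + 0011100111111101
    = 0100010111111111
    + 0001010010000000
    = 0101101001111111

(4)   0001100000000101
    + 0011100111111101
    = 0100010111111111
    + 0001011010011111
    = 0101110010011110

(5)   0011100111111101
    + 0001100000000101
    = 0011010100000011
    + 0001011100100111
    = 0100110000101010

(6)   0011100111111101
    + 0001100000000101
    = 0011010100000011
    + 0001001100001010
    = 0100100000001101

(7)   0001100000000101
    + 0001100000000101
    = 0010010000000111
    + 0001001000000011
    = 0011011000001010

(8)   0001100000000101
    - 0001100000000101
    = 1111001111111101
    + 0000110110000010
    = 0000000101111111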


Fig. 20 V1 hand calculation 
(1)   0000000000000000
    + 0000000000000000
    = 0000000000000000

(2)   0000000000000000
    + 1110001100110100
    = 1110001100110100
    + 0000000000000000
    = 1110001100110100

(3)   1100111011010111
    + 0001010001011101
    = 1111101111001000
    + 1111100011001101
    = 1111010010010101

(4)   1100111011010111
    + 0001010001011101
    = 1111101111001000
    + 1111110100100101
    = 1111100011101101

(5)   0001010001011101
    + 1100111011010111
    = 1101100100000101
    + 1111111000111011
    = 1101011101000000

(6)   0001010001011101
    + 1100111011010111
    = 1101100100000101
    + 1111010111010000
    = 1100111011010101

(7)   1100111011010111
    + 1100111011010111
    = 1011011001000010
    + 1111001110110101
    = 1010100111110111

(8)   1100111011010111
    - 1100111011010111
    = 0001100010010100
    + 1110101001111101
    = 0000001100010001


Fig. 21 V3 hand calculation 
(1)   0000000000000000
    + 0000000000000000
    = 0000000000000000

(2)   0000000000000000
    + 0000000000000000
    = 0000000000000000

(3)   0010110101000001
    + 0000000000000000
    = 0001011010100000
    + 0000000000000000
    = 0001011010100000

(4)   0010110101000001
    + 0000000000000000
    = 0001011010100000
    + 0000010110101000
    = 0001110001001000

(5)   1101001010111111
    + 0000000000000000
    = 1110100101011111
    + 0000011100010010
    = 1111000001110001

(6)   1101001010111111
    + 0000000000000000
    = 1110100101011111
    + 1111110000011100
    = 1110010101111011

(7)   0010110101000001
    + 0000000000000000
    = 0001011010100000
    + 1111100101011110
    = 0000111111111110

(8)   0000000000000000
    - 0000000000000000
    = 0000000000000000
    + 0000001111111111
    = 0000001111111111


Fig. 22 U4 hand calculation 
(1)   0000000000000000
    + 0000000000000000
    = 0000000000000000

(2)   0000000000000000
    + 0001001100111110
    = 0001001100111110
    + 0000000000000000
    = 0001001100111110

(3)   0010000011011001
    + 1111001001100101
    = 0000001011010001
    + 0000010011001111
    = 0000011110100000

(4)   0010000011011001
    + 1111001001100101
    = 0000001011010001
    + 0000000111101000
    = 0000010010111001

(5)   1111001001100101
    + 0010000011011001
    = 0001101000001011
    + 0000000100101110
    = 0001101100111001

(6)   1111001001100101
    + 0010000011011001
    = 0001101000001011
    + 0000011011001110
    = 0010000011011001

(7)   0010000011011001
    + 0010000011011001
    = 0011000101000101
    + 0000100000110110
    = 0011100101111011

(8)   0010000011011001
    - 0010000011011001
    = 1110111110010011
    + 0000111001011110
    = 1111110111110001


Fig. 23 V5 hand calculation
(1)   0000000000000000
    + 0000000000000000
    = 0000000000000000

(2)   0000000000000000
    + 1110111110110000
    = 1110111110110000
    + 0000000000000000
    = 1110111110110000

(3)   1111101100111001
    + 1111010001110111
    = 1111001000010011
    + 1111101111101100
    = 1110110111111111

(4)   1111101100111001
    + 1111010001110111
    = 1111001000010011
    + 1111101101111111
    = 1110110110010010

(5)   1111010001110111
    + 1111101100111001
    = 1111010101110100
    + 1111101101100100
    = 1111000011011000

(6)   1111010001110111
    + 1111101100111001
    = 1111010101110100
    + 1111110000110110
    = 1111000110101010

(7)   1111101100111001
    + 1111101100111001
    = 1111100011010101
    + 1111110001101010
    = 1111010100111111

(8)   1111101100111001
    - 1111101100111001
    = 0000001001100011
    + 1111110101001111
    = 1111111110110010


Fig. 24 V7 hand calculation 
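
The eight steps of each hand calculation can be replayed with a short Python sketch. The procedure coded here is inferred from the worked numbers rather than stated in this appendix, so it should be read as an assumption: each step halves the first operand with an arithmetic shift, adds the second operand (the final step subtracts it), and then adds the previous result shifted right by two bits. The names `asr` and `accumulate` are hypothetical, and no attempt is made to model the overflow noted in Fig. 19.

```python
MASK = 0xFFFF  # 16-bit word


def asr(word, n):
    """Arithmetic right shift of a 16-bit two's-complement word."""
    signed = word - 0x10000 if word & 0x8000 else word
    return (signed >> n) & MASK


def accumulate(pairs):
    """Replay the shift-and-add steps; the last step subtracts.

    Each pair (a, b) gives partial = (a >> 1) + b, or (a >> 1) - b in the
    final step, and the running result is partial + (previous >> 2).
    Returns the running result after every step.
    """
    result = 0
    history = []
    for step, (a, b) in enumerate(pairs, start=1):
        last = step == len(pairs)
        partial = (asr(a, 1) - b if last else asr(a, 1) + b) & MASK
        result = (partial + asr(result, 2)) & MASK
        history.append(result)
    return history


# Operand pairs read from Fig. 22 (U4).
u4_pairs = [(0, 0), (0, 0), (0x2D41, 0), (0x2D41, 0),
            (0xD2BF, 0), (0xD2BF, 0), (0x2D41, 0), (0, 0)]
print(format(accumulate(u4_pairs)[-1], "016b"))  # 0000001111111111
```

The printed word equals the last line of Fig. 22, and the intermediate entries of the returned list correspond to the result lines of steps (1) through (7).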
APPENDIX E. FORMATION OF 2-BIT ADDER


TWO BIT ADDER TRUTH TABLE


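Table XX Truth table of 2-bit adder

A1 A0 B1 B0 Ci | q1 q0 Co
 0  0  0  0  0 |  0  0  0
 0  0  0  0  1 |  0  1  0
 0  0  0  1  0 |  0  1  0
 0  0  0  1  1 |  1  0  0
 0  0  1  0  0 |  1  0  0
 0  0  1  0  1 |  1  1  0
 0  0  1  1  0 |  1  1  0
 0  0  1  1  1 |  0  0  1
 0  1  0  0  0 |  0  1  0
 0  1  0  0  1 |  1  0  0
 0  1  0  1  0 |  1  0  0
 0  1  0  1  1 |  1  1  0
 0  1  1  0  0 |  1  1  0
 0  1  1  0  1 |  0  0  1
 0  1  1  1  0 |  0  0  1
 0  1  1  1  1 |  0  1  1

Table XXI (Table XX) continued

A1 A0 B1 B0 Ci | q1 q0 Co
 1  0  0  0  0 |  1  0  0
 1  0  0  0  1 |  1  1  0
 1  0  0  1  0 |  1  1  0
 1  0  0  1  1 |  0  0  1
 1  0  1  0  0 |  0  0  1
 1  0  1  0  1 |  0  1  1
 1  0  1  1  0 |  0  1  1
 1  0  1  1  1 |  1  0  1
 1  1  0  0  0 |  1  1  0
 1  1  0  0  1 |  0  0  1
 1  1  0  1  0 |  0  0  1
 1  1  0  1  1 |  0  1  1
 1  1  1  0  0 |  0  1  1
 1  1  1  0  1 |  1  0  1
 1  1  1  1  0 |  1  0  1
 1  1  1  1  1 |  1  1  1
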


The two-bit adder has five inputs and three outputs. A1, A0, B1, and B0 represent the
inputs and Ci represents the carry in. q1 and q0 represent the outputs and Co represents
the carry out. After the truth table is set up, it can be reduced with a Karnaugh map.


Fig. 25 Karnaugh map reduction

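Karnaugh map reduction gives the following reduced Boolean expressions.

$$
\begin{aligned}
q_1 ={} & \overline{A_1} A_0 \overline{B_1} C_i + \overline{A_1}\,\overline{B_1} B_0 C_i + A_1 B_1 B_0 C_i + A_1 A_0 B_1 C_i + \overline{A_1} A_0 \overline{B_1} B_0 + A_1 A_0 B_1 B_0 \\
& + \overline{A_1}\,\overline{A_0} B_1 \overline{B_0} + A_1 \overline{A_0}\,\overline{B_1}\,\overline{B_0} + \overline{A_1}\,\overline{A_0} B_1 \overline{C_i} + \overline{A_1} B_1 \overline{B_0}\,\overline{C_i} + A_1 \overline{B_1}\,\overline{B_0}\,\overline{C_i} + A_1 \overline{A_0}\,\overline{B_1}\,\overline{C_i}
\end{aligned}
\tag{36}
$$

$$
q_0 = \overline{A_0}\,\overline{B_0} C_i + \overline{A_0} B_0 \overline{C_i} + A_0 \overline{B_0}\,\overline{C_i} + A_0 B_0 C_i \tag{37}
$$

$$
C_o = A_1 A_0 B_0 + A_0 B_1 B_0 + A_1 A_0 C_i + A_1 B_0 C_i + B_1 B_0 C_i + A_1 B_1 + A_0 B_1 C_i \tag{38}
$$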
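
The reduced expressions can be checked against ordinary binary addition by enumerating all 32 input combinations. The following Python sketch does this. The function and variable names are hypothetical, and the placement of complemented literals follows the standard sum-of-products form of a two-bit adder, which is assumed to be what equations (36) to (38) express.

```python
from itertools import product


def two_bit_adder_sop(a1, a0, b1, b0, ci):
    """Evaluate the sum-of-products forms; returns (q1, q0, co) as 0 or 1."""
    A1, A0, B1, B0, Ci = (bool(v) for v in (a1, a0, b1, b0, ci))
    nA1, nA0, nB1, nB0, nCi = (not A1, not A0, not B1, not B0, not Ci)
    # q1: twelve four-literal product terms, as in equation (36)
    q1 = (
        (nA1 and A0 and nB1 and Ci) or (nA1 and nB1 and B0 and Ci)
        or (A1 and B1 and B0 and Ci) or (A1 and A0 and B1 and Ci)
        or (nA1 and A0 and nB1 and B0) or (A1 and A0 and B1 and B0)
        or (nA1 and nA0 and B1 and nB0) or (A1 and nA0 and nB1 and nB0)
        or (nA1 and nA0 and B1 and nCi) or (nA1 and B1 and nB0 and nCi)
        or (A1 and nB1 and nB0 and nCi) or (A1 and nA0 and nB1 and nCi)
    )
    # q0: odd parity of A0, B0 and Ci, as in equation (37)
    q0 = (
        (nA0 and nB0 and Ci) or (nA0 and B0 and nCi)
        or (A0 and nB0 and nCi) or (A0 and B0 and Ci)
    )
    # co: seven product terms without complements, as in equation (38)
    co = (
        (A1 and A0 and B0) or (A0 and B1 and B0) or (A1 and A0 and Ci)
        or (A1 and B0 and Ci) or (B1 and B0 and Ci) or (A1 and B1)
        or (A0 and B1 and Ci)
    )
    return int(q1), int(q0), int(co)


def matches_arithmetic():
    """Compare the expressions with plain addition for all 32 inputs."""
    for a1, a0, b1, b0, ci in product((0, 1), repeat=5):
        total = (2 * a1 + a0) + (2 * b1 + b0) + ci
        expected = ((total >> 1) & 1, total & 1, total >> 2)
        if two_bit_adder_sop(a1, a0, b1, b0, ci) != expected:
            return False
    return True


print(matches_arithmetic())  # True
```

An exhaustive check is practical here because five inputs give only 32 rows, the same rows listed in the truth table.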